Posted to users@cloudstack.apache.org by Simon Weller <sw...@ena.com.INVALID> on 2017/11/01 14:10:41 UTC

Re: Problems with KVM HA & STONITH

James,


Try configuring just a single NFS server and see whether your setup works. If you have 3 NFS shares across all 3 hosts, I'm wondering whether ACS is picking the share on the host you rebooted as the storage for your VMs. When that storage goes away (when you bounce the host), all storage for your VMs vanishes and ACS tries to reboot your other hosts.


Normally in a simple ACS setup, you would have a separate storage server that can serve up NFS to all hosts. If a host dies, then a VM would be brought up on a spare host since all hosts have access to the same storage.

Your other option is to use local storage, but that won't provide HA.


- Si
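
The failure mode described above can be illustrated with a small toy model. This is only a sketch, not CloudStack code: it assumes every host mounts every primary storage pool and that a host which loses any pool reboots itself (the agent behaviour mentioned elsewhere in this thread). The function name and the external server name "NFS1" are made up for the example.

```python
# Toy model (not CloudStack code): which surviving hosts are affected when
# one host goes down, for two primary storage layouts.

def hosts_rebooted(hosts, pools, failed_host):
    """Return the surviving hosts that lose at least one mounted pool.

    `pools` maps a pool name to the machine that serves it. Assumption:
    every host mounts every pool, and a host that loses any pool reboots.
    """
    lost_pools = {name for name, server in pools.items() if server == failed_host}
    survivors = [h for h in hosts if h != failed_host]
    # Every survivor mounts every pool, so all of them are affected as soon
    # as at least one pool has disappeared.
    return survivors if lost_pools else []


hosts = ["NODE1", "NODE2", "NODE3"]

# Layout from the original post: each node exports its own NFS share.
per_host_pools = {"pool1": "NODE1", "pool2": "NODE2", "pool3": "NODE3"}

# Suggested layout: one NFS server outside the cluster (hypothetical "NFS1").
shared_pool = {"pool1": "NFS1"}

print(hosts_rebooted(hosts, per_host_pools, "NODE1"))  # ['NODE2', 'NODE3']
print(hosts_rebooted(hosts, shared_pool, "NODE1"))     # []
```

With one share per node, losing any node takes a pool away from every other node, which matches the cluster-wide reboot James reported. With a single external NFS server, losing a compute node removes no storage, so the remaining hosts stay up and can restart the HA VMs.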


________________________________
From: McClune, James <mc...@norwalktruckers.net>
Sent: Monday, October 30, 2017 2:26 PM
To: users@cloudstack.apache.org
Subject: Re: Problems with KVM HA & STONITH

Hi Dag,

Thank you for responding back. I am currently running ACS 4.9 on an Ubuntu
14.04 VM. I have the three nodes, each having about 1TB of primary storage
(NFS) and 1TB of secondary storage (NFS). I added each NFS share into ACS.
All nodes are in a cluster.

Maybe I'm not understanding the setup or have misconfigured something. I'm
trying to set up an HA environment where, if a node running an HA-marked VM
goes down, the VM will start on another host. When I simulate a network
disconnect or reboot of a host, all of the nodes go down (STONITH?).

I am unsure how to set up an HA environment if all the nodes in the
cluster go down. Any help is much appreciated!

Thanks,
James

On Mon, Oct 30, 2017 at 3:49 AM, Dag Sonstebo <Da...@shapeblue.com>
wrote:

> Hi James,
>
> I think you have possibly over-configured your KVM hosts. If you use NFS
> (and no clustered file system like CLVM) then there should be no need to
> configure STONITH. CloudStack takes care of your HA, so this is not
> something you offload to the KVM host.
>
> (As mentioned the only time I have played with STONITH and CloudStack was
> for CLVM – and I eventually found it not fit for purpose, too unstable and
> causing too many issues like you describe. Note this was for block storage
> though – not NFS).
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 28/10/2017, 03:40, "Ivan Kudryavtsev" <ku...@bw-sw.com> wrote:
>
>     Hi. If a node loses its NFS host, it reboots (ACS agent behaviour). If you
>     really have 3 storages, you'll get a cluster-wide reboot every time a
>     host is down.
>
>     On 28 Oct 2017 at 3:02, "Simon Weller" <sw...@ena.com.invalid>
>     wrote:
>
>     > Hi James,
>     >
>     >
>     > Can you elaborate a bit further on the storage? You say you're running
>     > NFS on all 3 nodes; can you explain how it is set up?
>     >
>     > Also, what version of ACS are you running?
>     >
>     >
>     > - Si
>     >
>     >
>     >
>     >
>     > ________________________________
>     > From: McClune, James <mc...@norwalktruckers.net>
>     > Sent: Friday, October 27, 2017 2:21 PM
>     > To: users@cloudstack.apache.org
>     > Subject: Problems with KVM HA & STONITH
>     >
>     > Hello Apache CloudStack Community,
>     >
>     > My setup consists of the following:
>     >
>     > - Three nodes (NODE1, NODE2, and NODE3)
>     > NODE1 is running Ubuntu 16.04.3, NODE2 is running Ubuntu 16.04.3, and
>     > NODE3 is running Ubuntu 14.04.5.
>     > - Management Server (running on separate VM, not in cluster)
>     >
>     > The three nodes use KVM as the hypervisor. I also configured primary and
>     > secondary storage on all three of the nodes. I'm using NFS for the
>     > primary & secondary storage. VM operations work great. Live migration
>     > works great.
>     >
>     > However, when a host goes down, the HA functionality does not work at
>     > all. Instead of spinning up the VM on another available host, the down
>     > host seems to trigger STONITH. When STONITH happens, all hosts in the
>     > cluster go down. This not only causes no HA, but also downs perfectly
>     > good VMs. I have read countless articles and documentation related to
>     > this issue. I still cannot find a viable solution for this issue. I
>     > really want to use Apache CloudStack, but cannot implement this in
>     > production when STONITH happens.
>     >
>     > I think I have something misconfigured. I thought I would reach out to
>     > the CloudStack community and ask for some friendly assistance.
>     >
>     > If there is anything (system-wise) you request in order to further
>     > troubleshoot this issue, please let me know and I'll send. I appreciate
>     > any help in this issue!
>     >
>     > --
>     >
>     > Thanks,
>     >
>     > James
>     >
>
>
>
> Dag.Sonstebo@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London WC2N 4HS UK
> @shapeblue


--



James McClune

Technical Support Specialist

Norwalk City Schools

Phone: 419-660-6590

mcclunej@norwalktruckers.net

Re: Problems with KVM HA & STONITH

Posted by Ivan Kudryavtsev <ku...@bw-sw.com>.
You can also run Ceph if you need HA. I have come across a setup description
that uses the compute nodes as Ceph cluster nodes at the same time.

Re: Problems with KVM HA & STONITH

Posted by "McClune, James" <mc...@norwalktruckers.net>.
Hi Victor,

If I may interject, I read your email and understand you're running KVM
with Ceph storage. As far as I know, ACS only supports HA on NFS or iSCSI
primary storage.

http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/4.11/reliability.html

However, if you wanted to use Ceph, you could create an RBD block device
and export it over NFS. Here is an article I referenced in the past:

https://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/

You could then add that NFS storage into ACS and utilize HA. I hope I'm
understanding you correctly.

Best Regards,
James

On Thu, Apr 5, 2018 at 12:53 PM, victor <vi...@ihnetworks.com> wrote:

> Hello Boris,
>
> I am able to create a VM with NFS + HA and with NFS without HA. The issue is
> with creating a VM with Ceph storage.
>
> Regards
> Victor
>
>
>
> On 04/05/2018 01:18 PM, Boris Stoyanov wrote:
>
>> Hi Victor,
>> Host HA is working only with KVM + NFS. Ceph is not supported at this
>> stage. Obviously RAW volumes are not supported on your pool, but I’m not
>> sure if that’s because of Ceph or HA in general. Are you able to deploy a
>> non-ha VM?
>>
>> Boris Stoyanov
>>
>>
>> boris.stoyanov@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
>> @shapeblue
>>
>>
>>> On 5 Apr 2018, at 4:19, victor <vi...@ihnetworks.com> wrote:
>>>
>>> Hello Rohit,
>>>
>>> Has the Host HA provider started working with Ceph? The reason I am asking
>>> is that I am not able to create a VM with Ceph storage on a KVM host
>>> with HA enabled, and I am getting the following error while creating the VM.
>>>
>>> ============
>>> .cloud.exception.StorageUnavailableException: Resource [StoragePool:2] is unreachable: Unable to create Vol[9|vm=6|DATADISK]:com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: unsupported configuration: only RAW volumes are supported by this storage pool
>>> ============
>>>
>>> Regards
>>> Victor
>>>
>>> On 11/04/2017 09:53 PM, Rohit Yadav wrote:
>>>
>>>> Hi James, (/cc Simon and others),
>>>>
>>>>
>>>> A new feature exists in upcoming ACS 4.11, Host HA:
>>>>
>>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA
>>>>
>>>> You can read more about it here as well:
>>>> http://www.shapeblue.com/host-ha-for-kvm-hosts-in-cloudstack/
>>>>
>>>> This feature can use a custom HA provider, with the default HA provider
>>>> implemented for KVM and NFS, and uses IPMI-based fencing (STONITH) of the
>>>> host. The current HA mechanism provides no such method of fencing (powering
>>>> off) a host, and it depends on the circumstances under which the VM HA is
>>>> failing (environment issues, ACS version, etc.).
>>>>
>>>> As Simon mentioned, we expect to have a (host) HA provider that works with
>>>> Ceph in the near future.
>>>>
>>>> Regards.
>>>>
>>>> ________________________________
>>>> From: Simon Weller <sw...@ena.com.INVALID>
>>>> Sent: Thursday, November 2, 2017 7:27:22 PM
>>>> To: users@cloudstack.apache.org
>>>> Subject: Re: Problems with KVM HA & STONITH
>>>>
>>>> James,
>>>>
>>>>
>>>> Ceph is a great solution and we run all of our ACS storage on Ceph.
>>>> Note that it adds another layer of complexity to your installation, so
>>>> you're going to need to develop some expertise with that platform to get
>>>> comfortable with how it works. Typically you don't want to mix Ceph with
>>>> your ACS hosts. We in fact deploy 3 separate Ceph Monitors, and then scale
>>>> OSDs as required on a per cluster basis in order to add additional
>>>> resiliency (so every KVM ACS cluster has its own Ceph "POD"). We also use
>>>> Ceph for S3 storage (on completely separate Ceph clusters) for some other
>>>> services.
>>>>
>>>>
>>>> NFS is much simpler to maintain for smaller installations in my
>>>> opinion. If the IO load you're looking at isn't going to be insanely high,
>>>> you could look at building a 2 node NFS cluster using Pacemaker and DRBD
>>>> for data replication between nodes. That would reduce your storage
>>>> requirement to 2 fairly low power servers (NFS is not very CPU intensive).
>>>> Currently on a host failure when using a storage other than NFS on KVM, you
>>>> will not see HA occur until you take the failed host out of the ACS
>>>> cluster. This is a historical limitation because ACS could not confirm the
>>>> host had been fenced correctly, so to avoid potential data corruption (due
>>>> to 2 hosts mounting the same storage), it doesn't do anything until the
>>>> operator intervenes. As of ACS 4.10, IPMI based fencing is now supported on
>>>> NFS and we're planning on developing similar support for Ceph.
>>>>
>>>>
>>>> Since you're a school district, I'm more than happy to jump on the
>>>> phone with you to talk you through these options if you'd like.
>>>>
>>>>
>>>> - Si
>>>>
>>>>
>>>> ________________________________
>>>> From: McClune, James <mc...@norwalktruckers.net>
>>>> Sent: Thursday, November 2, 2017 8:28 AM
>>>> To: users@cloudstack.apache.org
>>>> Subject: Re: Problems with KVM HA & STONITH
>>>>
>>>> Hi Simon,
>>>>
>>>> Thanks for getting back to me. I created one single NFS share and added it
>>>> as primary storage. I think I better understand how the storage works, with
>>>> ACS.
>>>>
>>>> I was able to get HA working with one NFS storage, which is good. However,
>>>> is there a way to incorporate multiple NFS storage pools and still have the
>>>> HA functionality? I think something like GlusterFS or Ceph (like Ivan and
>>>> Dag described) will work better.
>>>>
>>>> Thank you Simon, Ivan, and Dag for your assistance!
>>>> James

Re: Problems with KVM HA & STONITH

Posted by victor <vi...@ihnetworks.com>.
Hello Boris,

I am able to create a VM with NFS + HA and with NFS without HA. The issue is
with creating a VM with Ceph storage.

Regards
Victor


On 04/05/2018 01:18 PM, Boris Stoyanov wrote:
> Hi Victor,
> Host HA is working only with KVM + NFS. Ceph is not supported at this stage. Obviously RAW volumes are not supported on your pool, but I’m not sure if that’s because of Ceph or HA in general. Are you able to deploy a non-ha VM?
>
> Boris Stoyanov
>
>
> boris.stoyanov@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>    
>   
>
>> On 5 Apr 2018, at 4:19, victor <vi...@ihnetworks.com> wrote:
>>
>> Hello Rohit,
>>
>> Is the Host HA provider start working with Ceph. The reason I am asking is because, I am not able to create a VM with Ceph storage in a kvm host with HA enabled and I am getting the following error while creating VM.
>>
>> ============
>> .cloud.exception.StorageUnavailableException: Resource [StoragePool:2] is unreachable: Unable to create Vol[9|vm=6|DATADISK]:com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: unsupported configuration: only RAW volumes are supported by this storage pool
>> ============
>>
>> Regards
>> Victor
>>
>> On 11/04/2017 09:53 PM, Rohit Yadav wrote:
>>> Hi James, (/cc Simon and others),
>>>
>>>
>>> A new feature exists in upcoming ACS 4.11, Host HA:
>>>
>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA
>>>
>>> You can read more about it here as well: http://www.shapeblue.com/host-ha-for-kvm-hosts-in-cloudstack/
>>>
>>> This feature can use a custom HA provider, with default HA provider implemented for KVM and NFS, and uses ipmi based fencing (STONITH) of the host. The current HA mechanism provides no such method of fencing (powering off) a host and it depends under what circumstances the VM HA is failing (environment issues, ACS version etc).
>>>
>>> As Simon mentioned, we have a (host) HA provider that works with Ceph in near future.
>>>
>>> Regards.
>>>
>>> ________________________________
>>> From: Simon Weller <sw...@ena.com.INVALID>
>>> Sent: Thursday, November 2, 2017 7:27:22 PM
>>> To: users@cloudstack.apache.org
>>> Subject: Re: Problems with KVM HA & STONITH
>>>
>>> James,
>>>
>>>
>>> Ceph is a great solution and we run all of our ACS storage on Ceph. Note that it adds another layer of complexity to your installation, so you're going need to develop some expertise with that platform to get comfortable with how it works. Typically you don't want to mix Ceph with your ACS hosts. We in fact deploy 3 separate Ceph Monitors, and then scale OSDs as required on a per cluster basis in order to add additional resiliency (So every KVM ACS cluster has it's own Ceph "POD").  We also use Ceph for S3 storage (on completely separate Ceph clusters) for some other services.
>>>
>>>
>>> NFS is much simpler to maintain for smaller installations in my opinion. If the IO load you're looking at isn't going to be insanely high, you could look at building a 2 node NFS cluster using pacemaker and DRDB for data replication between nodes. That would reduce your storage requirement to 2 fairly low power servers (NFS is not very cpu intensive). Currently on a host failure when using a storage other than NFS on KVM, you will not see HA occur until you take the failed host out of the ACS cluster. This is a historical limitation because ACS could not confirm the host had been fenced correctly, so to avoid potential data corruption (due to 2 hosts mounting the same storage), it doesn't do anything until the operator intervenes. As of ACS 4.10, IPMI based fencing is now supported on NFS and we're planning on developing similar support for Ceph.
>>>
>>>
>>> Since you're an school district, I'm more than happy to jump on the phone with you to talk you through these options if you'd like.
>>>
>>>
>>> - Si
>>>
>>>
>>> ________________________________
>>> From: McClune, James <mc...@norwalktruckers.net>
>>> Sent: Thursday, November 2, 2017 8:28 AM
>>> To: users@cloudstack.apache.org
>>> Subject: Re: Problems with KVM HA & STONITH
>>>
>>> Hi Simon,
>>>
>>> Thanks for getting back to me. I created one single NFS share and added it
>>> as primary storage. I think I better understand how the storage works, with
>>> ACS.
>>>
>>> I was able to get HA working with one NFS storage, which is good. However,
>>> is there a way to incorporate multiple NFS storage pools and still have the
>>> HA functionality? I think something like GlusterFS or Ceph (like Ivan and
>>> Dag described) will work better.
>>>
>>> Thank you Simon, Ivan, and Dag for your assistance!
>>> James
>>>
>>> On Wed, Nov 1, 2017 at 10:10 AM, Simon Weller <sw...@ena.com.invalid>
>>> wrote:
>>>
>>>> James,
>>>>
>>>>
>>>> Try just configuring a single NFS server and see if your setup works. If
>>>> you have 3 NFS shares, across all 3 hosts, i'm wondering whether ACS is
>>>> picking the one you rebooted as the storage for your VMs and when that
>>>> storage goes away (when you bounce the host), all storage for your VMs
>>>> vanishes and ACS tries to reboot your other hosts.
>>>>
>>>>
>>>> Normally in a simple ACS setup, you would have a separate storage server
>>>> that can serve up NFS to all hosts. If a host dies, then a VM would be
>>>> brought up on a spare hosts since all hosts have access to the same storage.
>>>>
>>>> Your other option is to use local storage, but that won't provide HA.
>>>>
>>>>
>>>> - Si
>>>>
>>>>
>>>> ________________________________
>>>> From: McClune, James <mc...@norwalktruckers.net>
>>>> Sent: Monday, October 30, 2017 2:26 PM
>>>> To: users@cloudstack.apache.org
>>>> Subject: Re: Problems with KVM HA & STONITH
>>>>
>>>> Hi Dag,
>>>>
>>>> Thank you for responding back. I am currently running ACS 4.9 on an Ubuntu
>>>> 14.04 VM. I have the three nodes, each having about 1TB of primary storage
>>>> (NFS) and 1TB of secondary storage (NFS). I added each NFS share into ACS.
>>>> All nodes are in a cluster.
>>>>
>>>> Maybe I'm not understanding the setup or misconfigured something. I'm
>>>> trying to set up an HA environment where, if one node running an
>>>> HA-marked VM goes down, the VM will start on another host. When I simulate a
>>>> network disconnect or reboot of a host, all of the nodes go down (STONITH?).
>>>>
>>>> I am unsure how to set up an HA environment if all the nodes in the
>>>> cluster go down. Any help is much appreciated!
>>>>
>>>> Thanks,
>>>> James
>>>>
>>>> On Mon, Oct 30, 2017 at 3:49 AM, Dag Sonstebo <Da...@shapeblue.com>
>>>> wrote:
>>>>
>>>>> Hi James,
>>>>>
>>>>> I think you possibly have over-configured your KVM hosts. If you use NFS
>>>>> (and no clustered file system like CLVM) then there should be no need to
>>>>> configure STONITH. CloudStack takes care of your HA, so this is not
>>>>> something you offload to the KVM host.
>>>>>
>>>>> (As mentioned the only time I have played with STONITH and CloudStack was
>>>>> for CLVM – and I eventually found it not fit for purpose, too unstable and
>>>>> causing too many issues like you describe. Note this was for block storage
>>>>> though – not NFS).
>>>>>
>>>>> Regards,
>>>>> Dag Sonstebo
>>>>> Cloud Architect
>>>>> ShapeBlue
>>>>>
>>>>> On 28/10/2017, 03:40, "Ivan Kudryavtsev" <ku...@bw-sw.com> wrote:
>>>>>      Hi. If a node loses its NFS host, it reboots (ACS agent behaviour). If
>>>>>      you really have 3 storages, you'll get a cluster-wide reboot every time
>>>>>      a host is down.
>>>>>
>>>>>      On 28 Oct 2017 at 3:02, "Simon Weller" <sw...@ena.com.invalid> wrote:
>>>>>
>>>>>      > Hi James,
>>>>>      >
>>>>>      >
>>>>>      > Can you elaborate a bit further on the storage? You say you're
>>>>>      > running NFS on all 3 nodes; can you explain how it is set up?
>>>>>      >
>>>>>      > Also, what version of ACS are you running?
>>>>>      >
>>>>>      >
>>>>>      > - Si
>>>>>      >
>>>>>      >
>>>>>      >
>>>>>      >
>>>>>      > ________________________________
>>>>>      > From: McClune, James <mc...@norwalktruckers.net>
>>>>>      > Sent: Friday, October 27, 2017 2:21 PM
>>>>>      > To: users@cloudstack.apache.org
>>>>>      > Subject: Problems with KVM HA & STONITH
>>>>>      >
>>>>>      > Hello Apache CloudStack Community,
>>>>>      >
>>>>>      > My setup consists of the following:
>>>>>      >
>>>>>      > - Three nodes (NODE1, NODE2, and NODE3)
>>>>>      > NODE1 is running Ubuntu 16.04.3, NODE2 is running Ubuntu 16.04.3,
>>>>>      > and NODE3 is running Ubuntu 14.04.5.
>>>>>      > - Management Server (running on separate VM, not in cluster)
>>>>>      >
>>>>>      > The three nodes use KVM as the hypervisor. I also configured primary
>>>>>      > and secondary storage on all three of the nodes. I'm using NFS for
>>>>>      > the primary & secondary storage. VM operations work great. Live
>>>>>      > migration works great.
>>>>>      >
>>>>>      > However, when a host goes down, the HA functionality does not work
>>>>>      > at all. Instead of spinning up the VM on another available host, the
>>>>>      > down host seems to trigger STONITH. When STONITH happens, all hosts
>>>>>      > in the cluster go down. This not only causes no HA, but also downs
>>>>>      > perfectly good VMs. I have read countless articles and documentation
>>>>>      > related to this issue. I still cannot find a viable solution for this
>>>>>      > issue. I really want to use Apache CloudStack, but cannot implement
>>>>>      > this in production when STONITH happens.
>>>>>      >
>>>>>      > I think I have something misconfigured. I thought I would reach out
>>>>>      > to the CloudStack community and ask for some friendly assistance.
>>>>>      >
>>>>>      > If there is anything (system-wise) you request in order to further
>>>>>      > troubleshoot this issue, please let me know and I'll send it. I
>>>>>      > appreciate any help with this issue!
>>>>>      >
>>>>>      > --
>>>>>      >
>>>>>      > Thanks,
>>>>>      >
>>>>>      > James
>>>>>      >
>>>>>
>>>>>
>>>>>
>>>>> Dag.Sonstebo@shapeblue.com
>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
>>>>> @shapeblue
>>>>>
>>>>>
>>>>>
>>>>>
>>>> --
>>>>
>>>>
>>>>
>>>> James McClune
>>>>
>>>> Technical Support Specialist
>>>>
>>>> Norwalk City Schools
>>>>
>>>> Phone: 419-660-6590
>>>>
>>>> mcclunej@norwalktruckers.net
>>>>
>>>
>>> --
>>>
>>>
>>>
>>> James McClune
>>>
>>> Technical Support Specialist
>>>
>>> Norwalk City Schools
>>>
>>> Phone: 419-660-6590
>>>
>>> mcclunej@norwalktruckers.net
>>>
>>> rohit.yadav@shapeblue.com
>>> www.shapeblue.com
>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
>>> @shapeblue
>>>      
>>>


Re: Problems with KVM HA & STONITH

Posted by Boris Stoyanov <bo...@shapeblue.com>.
Hi Victor, 
Host HA currently works only with KVM + NFS; Ceph is not supported at this stage. The error says that your pool supports only RAW volumes, but I'm not sure whether that's because of Ceph or HA in general. Are you able to deploy a non-HA VM?
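
Going by the wording of the libvirt error quoted below, the failure looks like a volume-format mismatch: the pool accepts only raw volumes, so a request for any other format is rejected. A minimal Python sketch of that kind of check follows. The function name and the pool-type table are hypothetical illustrations, not CloudStack or libvirt code, and the "nfs" entry is an assumption added for contrast.

```python
# Hypothetical illustration: which volume formats a pool type accepts.
# The "rbd" (Ceph) entry mirrors the restriction stated in the libvirt
# error; the "nfs" entry is an assumption for contrast.
ALLOWED_FORMATS = {
    "rbd": {"raw"},
    "nfs": {"raw", "qcow2"},
}


def check_volume_format(pool_type: str, volume_format: str) -> None:
    """Raise ValueError if the pool type cannot hold the requested format."""
    allowed = ALLOWED_FORMATS.get(pool_type)
    if allowed is None:
        raise ValueError(f"unknown pool type: {pool_type}")
    if volume_format not in allowed:
        raise ValueError(
            f"{pool_type} pool supports only {sorted(allowed)}, "
            f"got {volume_format}"
        )


check_volume_format("rbd", "raw")  # accepted, returns None
try:
    check_volume_format("rbd", "qcow2")  # rejected
except ValueError as err:
    print(err)
```

If that reading is right, deploying a non-HA VM on the same pool should fail in the same way, which would separate the storage question from the HA question.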

Boris Stoyanov


boris.stoyanov@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 

> On 5 Apr 2018, at 4:19, victor <vi...@ihnetworks.com> wrote:
> 
> Hello Rohit,
> 
> Has the Host HA provider started working with Ceph? I am asking because I am not able to create a VM with Ceph storage on a KVM host with HA enabled, and I get the following error while creating the VM.
> 
> ============
> .cloud.exception.StorageUnavailableException: Resource [StoragePool:2] is unreachable: Unable to create Vol[9|vm=6|DATADISK]:com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: unsupported configuration: only RAW volumes are supported by this storage pool
> ============
> 
> Regards
> Victor
> 
> On 11/04/2017 09:53 PM, Rohit Yadav wrote:
>> Hi James, (/cc Simon and others),
>> 
>> 
>> A new feature exists in upcoming ACS 4.11, Host HA:
>> 
>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA
>> 
>> You can read more about it here as well: http://www.shapeblue.com/host-ha-for-kvm-hosts-in-cloudstack/
>> 
>> This feature can use a custom HA provider; the default HA provider is implemented for KVM and NFS and uses IPMI-based fencing (STONITH) of the host. The current HA mechanism provides no such method of fencing (powering off) a host, and it depends on the circumstances under which VM HA is failing (environment issues, ACS version, etc.).
>> 
>> As Simon mentioned, we plan to have a (host) HA provider that works with Ceph in the near future.
>> 
>> Regards.
>> 
>> ________________________________
>> From: Simon Weller <sw...@ena.com.INVALID>
>> Sent: Thursday, November 2, 2017 7:27:22 PM
>> To: users@cloudstack.apache.org
>> Subject: Re: Problems with KVM HA & STONITH
>> 
>> James,
>> 
>> 
>> Ceph is a great solution and we run all of our ACS storage on Ceph. Note that it adds another layer of complexity to your installation, so you're going to need to develop some expertise with that platform to get comfortable with how it works. Typically you don't want to mix Ceph with your ACS hosts. We in fact deploy 3 separate Ceph Monitors, and then scale OSDs as required on a per-cluster basis in order to add additional resiliency (so every KVM ACS cluster has its own Ceph "POD"). We also use Ceph for S3 storage (on completely separate Ceph clusters) for some other services.
>> 
>> 
>> NFS is much simpler to maintain for smaller installations, in my opinion. If the IO load you're looking at isn't going to be insanely high, you could look at building a 2-node NFS cluster using Pacemaker and DRBD for data replication between nodes. That would reduce your storage requirement to 2 fairly low-power servers (NFS is not very CPU-intensive). Currently, on a host failure when using storage other than NFS on KVM, you will not see HA occur until you take the failed host out of the ACS cluster. This is a historical limitation: ACS could not confirm the host had been fenced correctly, so to avoid potential data corruption (due to 2 hosts mounting the same storage), it doesn't do anything until the operator intervenes. As of ACS 4.10, IPMI-based fencing is supported on NFS, and we're planning on developing similar support for Ceph.
>> 
>> 
>> Since you're a school district, I'm more than happy to jump on the phone with you to talk you through these options if you'd like.
>> 
>> 
>> - Si
>> 
>> 
>> ________________________________
>> From: McClune, James <mc...@norwalktruckers.net>
>> Sent: Thursday, November 2, 2017 8:28 AM
>> To: users@cloudstack.apache.org
>> Subject: Re: Problems with KVM HA & STONITH
>> 
>> Hi Simon,
>> 
>> Thanks for getting back to me. I created one single NFS share and added it
>> as primary storage. I think I better understand how the storage works, with
>> ACS.
>> 
>> I was able to get HA working with one NFS storage, which is good. However,
>> is there a way to incorporate multiple NFS storage pools and still have the
>> HA functionality? I think something like GlusterFS or Ceph (like Ivan and
>> Dag described) will work better.
>> 
>> Thank you Simon, Ivan, and Dag for your assistance!
>> James
>> 
>> On Wed, Nov 1, 2017 at 10:10 AM, Simon Weller <sw...@ena.com.invalid>
>> wrote:
>> 
>>> James,
>>> 
>>> 
>>> Try just configuring a single NFS server and see if your setup works. If
>>> you have 3 NFS shares, across all 3 hosts, I'm wondering whether ACS is
>>> picking the one you rebooted as the storage for your VMs, and when that
>>> storage goes away (when you bounce the host), all storage for your VMs
>>> vanishes and ACS tries to reboot your other hosts.
>>> 
>>> 
>>> Normally in a simple ACS setup, you would have a separate storage server
>>> that can serve up NFS to all hosts. If a host dies, then a VM would be
>>> brought up on a spare host, since all hosts have access to the same storage.
>>> 
>>> Your other option is to use local storage, but that won't provide HA.
>>> 
>>> 
>>> - Si
>>> 
>>> 
>>> ________________________________
>>> From: McClune, James <mc...@norwalktruckers.net>
>>> Sent: Monday, October 30, 2017 2:26 PM
>>> To: users@cloudstack.apache.org
>>> Subject: Re: Problems with KVM HA & STONITH
>>> 
>>> Hi Dag,
>>> 
>>> Thank you for responding back. I am currently running ACS 4.9 on an Ubuntu
>>> 14.04 VM. I have the three nodes, each having about 1TB of primary storage
>>> (NFS) and 1TB of secondary storage (NFS). I added each NFS share into ACS.
>>> All nodes are in a cluster.
>>> 
>>> Maybe I'm not understanding the setup or misconfigured something. I'm
>>> trying to set up an HA environment where, if one node running an
>>> HA-marked VM goes down, the VM will start on another host. When I simulate a
>>> network disconnect or reboot of a host, all of the nodes go down (STONITH?).
>>> 
>>> I am unsure how to set up an HA environment if all the nodes in the
>>> cluster go down. Any help is much appreciated!
>>> 
>>> Thanks,
>>> James
>>> 
>>> On Mon, Oct 30, 2017 at 3:49 AM, Dag Sonstebo <Da...@shapeblue.com>
>>> wrote:
>>> 
>>>> Hi James,
>>>> 
>>>> I think you possibly have over-configured your KVM hosts. If you use NFS
>>>> (and no clustered file system like CLVM) then there should be no need to
>>>> configure STONITH. CloudStack takes care of your HA, so this is not
>>>> something you offload to the KVM host.
>>>> 
>>>> (As mentioned the only time I have played with STONITH and CloudStack was
>>>> for CLVM – and I eventually found it not fit for purpose, too unstable and
>>>> causing too many issues like you describe. Note this was for block storage
>>>> though – not NFS).
>>>> 
>>>> Regards,
>>>> Dag Sonstebo
>>>> Cloud Architect
>>>> ShapeBlue
>>>> 
>>>> On 28/10/2017, 03:40, "Ivan Kudryavtsev" <ku...@bw-sw.com> wrote:
>>>>     Hi. If a node loses its NFS host, it reboots (ACS agent behaviour). If
>>>>     you really have 3 storages, you'll get a cluster-wide reboot every time
>>>>     a host is down.
>>>> 
>>>>     On 28 Oct 2017 at 3:02, "Simon Weller" <sw...@ena.com.invalid> wrote:
>>>> 
>>>>     > Hi James,
>>>>     >
>>>>     >
>>>>     > Can you elaborate a bit further on the storage? You say you're
>>>>     > running NFS on all 3 nodes; can you explain how it is set up?
>>>>     >
>>>>     > Also, what version of ACS are you running?
>>>>     >
>>>>     >
>>>>     > - Si
>>>>     >
>>>>     >
>>>>     >
>>>>     >
>>>>     > ________________________________
>>>>     > From: McClune, James <mc...@norwalktruckers.net>
>>>>     > Sent: Friday, October 27, 2017 2:21 PM
>>>>     > To: users@cloudstack.apache.org
>>>>     > Subject: Problems with KVM HA & STONITH
>>>>     >
>>>>     > Hello Apache CloudStack Community,
>>>>     >
>>>>     > My setup consists of the following:
>>>>     >
>>>>     > - Three nodes (NODE1, NODE2, and NODE3)
>>>>     > NODE1 is running Ubuntu 16.04.3, NODE2 is running Ubuntu 16.04.3,
>>>>     > and NODE3 is running Ubuntu 14.04.5.
>>>>     > - Management Server (running on separate VM, not in cluster)
>>>>     >
>>>>     > The three nodes use KVM as the hypervisor. I also configured primary
>>>>     > and secondary storage on all three of the nodes. I'm using NFS for
>>>>     > the primary & secondary storage. VM operations work great. Live
>>>>     > migration works great.
>>>>     >
>>>>     > However, when a host goes down, the HA functionality does not work
>>>>     > at all. Instead of spinning up the VM on another available host, the
>>>>     > down host seems to trigger STONITH. When STONITH happens, all hosts
>>>>     > in the cluster go down. This not only causes no HA, but also downs
>>>>     > perfectly good VMs. I have read countless articles and documentation
>>>>     > related to this issue. I still cannot find a viable solution for this
>>>>     > issue. I really want to use Apache CloudStack, but cannot implement
>>>>     > this in production when STONITH happens.
>>>>     >
>>>>     > I think I have something misconfigured. I thought I would reach out
>>>>     > to the CloudStack community and ask for some friendly assistance.
>>>>     >
>>>>     > If there is anything (system-wise) you request in order to further
>>>>     > troubleshoot this issue, please let me know and I'll send it. I
>>>>     > appreciate any help with this issue!
>>>>     >
>>>>     > --
>>>>     >
>>>>     > Thanks,
>>>>     >
>>>>     > James
>>>>     >
>>>> 
>>>> 
>>>> 
>>>> Dag.Sonstebo@shapeblue.com
>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
>>>> @shapeblue
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> 
>>> 
>>> 
>>> James McClune
>>> 
>>> Technical Support Specialist
>>> 
>>> Norwalk City Schools
>>> 
>>> Phone: 419-660-6590
>>> 
>>> mcclunej@norwalktruckers.net
>>> 
>> 
>> 
>> --
>> 
>> 
>> 
>> James McClune
>> 
>> Technical Support Specialist
>> 
>> Norwalk City Schools
>> 
>> Phone: 419-660-6590
>> 
>> mcclunej@norwalktruckers.net
>> 
>> rohit.yadav@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
>> @shapeblue
>>     
>> 
> 


Re: Problems with KVM HA & STONITH

Posted by victor <vi...@ihnetworks.com>.
Hello Rohit,

Has the Host HA provider started working with Ceph? I am asking because
I am not able to create a VM with Ceph storage on a KVM host with HA
enabled, and I get the following error while creating the VM.

============
.cloud.exception.StorageUnavailableException: Resource [StoragePool:2] 
is unreachable: Unable to create 
Vol[9|vm=6|DATADISK]:com.cloud.utils.exception.CloudRuntimeException: 
org.libvirt.LibvirtException: unsupported configuration: only RAW 
volumes are supported by this storage pool
============

Regards
Victor

On 11/04/2017 09:53 PM, Rohit Yadav wrote:
> Hi James, (/cc Simon and others),
>
>
> A new feature exists in upcoming ACS 4.11, Host HA:
>
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA
>
> You can read more about it here as well: http://www.shapeblue.com/host-ha-for-kvm-hosts-in-cloudstack/
>
> This feature can use a custom HA provider; the default HA provider is implemented for KVM and NFS and uses IPMI-based fencing (STONITH) of the host. The current HA mechanism provides no such method of fencing (powering off) a host, and it depends on the circumstances under which VM HA is failing (environment issues, ACS version, etc.).
>
> As Simon mentioned, we plan to have a (host) HA provider that works with Ceph in the near future.
>
> Regards.
>
> ________________________________
> From: Simon Weller <sw...@ena.com.INVALID>
> Sent: Thursday, November 2, 2017 7:27:22 PM
> To: users@cloudstack.apache.org
> Subject: Re: Problems with KVM HA & STONITH
>
> James,
>
>
> Ceph is a great solution and we run all of our ACS storage on Ceph. Note that it adds another layer of complexity to your installation, so you're going to need to develop some expertise with that platform to get comfortable with how it works. Typically you don't want to mix Ceph with your ACS hosts. We in fact deploy 3 separate Ceph Monitors, and then scale OSDs as required on a per-cluster basis in order to add additional resiliency (so every KVM ACS cluster has its own Ceph "POD"). We also use Ceph for S3 storage (on completely separate Ceph clusters) for some other services.
>
>
> NFS is much simpler to maintain for smaller installations, in my opinion. If the IO load you're looking at isn't going to be insanely high, you could look at building a 2-node NFS cluster using Pacemaker and DRBD for data replication between nodes. That would reduce your storage requirement to 2 fairly low-power servers (NFS is not very CPU-intensive). Currently, on a host failure when using storage other than NFS on KVM, you will not see HA occur until you take the failed host out of the ACS cluster. This is a historical limitation: ACS could not confirm the host had been fenced correctly, so to avoid potential data corruption (due to 2 hosts mounting the same storage), it doesn't do anything until the operator intervenes. As of ACS 4.10, IPMI-based fencing is supported on NFS, and we're planning on developing similar support for Ceph.
>
>
> Since you're a school district, I'm more than happy to jump on the phone with you to talk you through these options if you'd like.
>
>
> - Si
>
>
> ________________________________
> From: McClune, James <mc...@norwalktruckers.net>
> Sent: Thursday, November 2, 2017 8:28 AM
> To: users@cloudstack.apache.org
> Subject: Re: Problems with KVM HA & STONITH
>
> Hi Simon,
>
> Thanks for getting back to me. I created one single NFS share and added it
> as primary storage. I think I better understand how the storage works, with
> ACS.
>
> I was able to get HA working with one NFS storage, which is good. However,
> is there a way to incorporate multiple NFS storage pools and still have the
> HA functionality? I think something like GlusterFS or Ceph (like Ivan and
> Dag described) will work better.
>
> Thank you Simon, Ivan, and Dag for your assistance!
> James
>
> On Wed, Nov 1, 2017 at 10:10 AM, Simon Weller <sw...@ena.com.invalid>
> wrote:
>
>> James,
>>
>>
>> Try just configuring a single NFS server and see if your setup works. If
>> you have 3 NFS shares, across all 3 hosts, I'm wondering whether ACS is
>> picking the one you rebooted as the storage for your VMs, and when that
>> storage goes away (when you bounce the host), all storage for your VMs
>> vanishes and ACS tries to reboot your other hosts.
>>
>>
>> Normally in a simple ACS setup, you would have a separate storage server
>> that can serve up NFS to all hosts. If a host dies, then a VM would be
>> brought up on a spare host, since all hosts have access to the same storage.
>>
>> Your other option is to use local storage, but that won't provide HA.
>>
>>
>> - Si
>>
>>
>> ________________________________
>> From: McClune, James <mc...@norwalktruckers.net>
>> Sent: Monday, October 30, 2017 2:26 PM
>> To: users@cloudstack.apache.org
>> Subject: Re: Problems with KVM HA & STONITH
>>
>> Hi Dag,
>>
>> Thank you for responding back. I am currently running ACS 4.9 on an Ubuntu
>> 14.04 VM. I have the three nodes, each having about 1TB of primary storage
>> (NFS) and 1TB of secondary storage (NFS). I added each NFS share into ACS.
>> All nodes are in a cluster.
>>
>> Maybe I'm not understanding the setup or misconfigured something. I'm
>> trying to set up an HA environment where, if one node running an
>> HA-marked VM goes down, the VM will start on another host. When I simulate a
>> network disconnect or reboot of a host, all of the nodes go down (STONITH?).
>>
>> I am unsure how to set up an HA environment if all the nodes in the
>> cluster go down. Any help is much appreciated!
>>
>> Thanks,
>> James
>>
>> On Mon, Oct 30, 2017 at 3:49 AM, Dag Sonstebo <Da...@shapeblue.com>
>> wrote:
>>
>>> Hi James,
>>>
>>> I think you possibly have over-configured your KVM hosts. If you use NFS
>>> (and no clustered file system like CLVM) then there should be no need to
>>> configure STONITH. CloudStack takes care of your HA, so this is not
>>> something you offload to the KVM host.
>>>
>>> (As mentioned the only time I have played with STONITH and CloudStack was
>>> for CLVM – and I eventually found it not fit for purpose, too unstable and
>>> causing too many issues like you describe. Note this was for block storage
>>> though – not NFS).
>>>
>>> Regards,
>>> Dag Sonstebo
>>> Cloud Architect
>>> ShapeBlue
>>>
>>> On 28/10/2017, 03:40, "Ivan Kudryavtsev" <ku...@bw-sw.com> wrote:
>>>      Hi. If a node loses its NFS host, it reboots (ACS agent behaviour). If
>>>      you really have 3 storages, you'll get a cluster-wide reboot every time
>>>      a host is down.
>>>
>>>      On 28 Oct 2017 at 3:02, "Simon Weller" <sw...@ena.com.invalid> wrote:
>>>
>>>      > Hi James,
>>>      >
>>>      >
>>>      > Can you elaborate a bit further on the storage? You say you're
>>>      > running NFS on all 3 nodes; can you explain how it is set up?
>>>      >
>>>      > Also, what version of ACS are you running?
>>>      >
>>>      >
>>>      > - Si
>>>      >
>>>      >
>>>      >
>>>      >
>>>      > ________________________________
>>>      > From: McClune, James <mc...@norwalktruckers.net>
>>>      > Sent: Friday, October 27, 2017 2:21 PM
>>>      > To: users@cloudstack.apache.org
>>>      > Subject: Problems with KVM HA & STONITH
>>>      >
>>>      > Hello Apache CloudStack Community,
>>>      >
>>>      > My setup consists of the following:
>>>      >
>>>      > - Three nodes (NODE1, NODE2, and NODE3)
>>>      > NODE1 is running Ubuntu 16.04.3, NODE2 is running Ubuntu 16.04.3,
>>>      > and NODE3 is running Ubuntu 14.04.5.
>>>      > - Management Server (running on separate VM, not in cluster)
>>>      >
>>>      > The three nodes use KVM as the hypervisor. I also configured primary
>>>      > and secondary storage on all three of the nodes. I'm using NFS for
>>>      > the primary & secondary storage. VM operations work great. Live
>>>      > migration works great.
>>>      >
>>>      > However, when a host goes down, the HA functionality does not work
>>>      > at all. Instead of spinning up the VM on another available host, the
>>>      > down host seems to trigger STONITH. When STONITH happens, all hosts
>>>      > in the cluster go down. This not only causes no HA, but also downs
>>>      > perfectly good VMs. I have read countless articles and documentation
>>>      > related to this issue. I still cannot find a viable solution for this
>>>      > issue. I really want to use Apache CloudStack, but cannot implement
>>>      > this in production when STONITH happens.
>>>      >
>>>      > I think I have something misconfigured. I thought I would reach out
>>>      > to the CloudStack community and ask for some friendly assistance.
>>>      >
>>>      > If there is anything (system-wise) you request in order to further
>>>      > troubleshoot this issue, please let me know and I'll send it. I
>>>      > appreciate any help with this issue!
>>>      >
>>>      > --
>>>      >
>>>      > Thanks,
>>>      >
>>>      > James
>>>      >
>>>
>>>
>>>
>>> Dag.Sonstebo@shapeblue.com
>>> www.shapeblue.com<http://www.shapeblue.com>
>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
>>> @shapeblue
>>>
>>>
>>>
>>>
>>
>> --
>>
>>
>>
>> James McClune
>>
>> Technical Support Specialist
>>
>> Norwalk City Schools
>>
>> Phone: 419-660-6590
>>
>> mcclunej@norwalktruckers.net
>>
>
>
> --
>
>
>
> James McClune
>
> Technical Support Specialist
>
> Norwalk City Schools
>
> Phone: 419-660-6590
>
> mcclunej@norwalktruckers.net
>
> rohit.yadav@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>    
>   
>
>


RE: Problems with KVM HA & STONITH

Posted by Simon Weller <sw...@ena.com.INVALID>.
Yep, very exciting!

Simon Weller/615-312-6068

-----Original Message-----
From: Rohit Yadav [rohit.yadav@shapeblue.com]
Received: Saturday, 04 Nov 2017, 11:23AM
To: users@cloudstack.apache.org [users@cloudstack.apache.org]
Subject: Re: Problems with KVM HA & STONITH

Hi James, (/cc Simon and others),


A new feature, Host HA, is coming in the upcoming ACS 4.11 release:

https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA

You can read more about it here as well: http://www.shapeblue.com/host-ha-for-kvm-hosts-in-cloudstack/

This feature can use a custom HA provider; the default HA provider is implemented for KVM with NFS and uses IPMI-based fencing (STONITH) of the host. The current HA mechanism provides no such method of fencing (powering off) a host, and whether VM HA fails depends on the circumstances (environment issues, ACS version, etc.).

As Simon mentioned, we plan to have a (host) HA provider that works with Ceph in the near future.

Regards.





Re: Problems with KVM HA & STONITH

Posted by Rohit Yadav <ro...@shapeblue.com>.
Hi James, (/cc Simon and others),


A new feature, Host HA, is coming in the upcoming ACS 4.11 release:

https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA

You can read more about it here as well: http://www.shapeblue.com/host-ha-for-kvm-hosts-in-cloudstack/

This feature can use a custom HA provider; the default HA provider is implemented for KVM with NFS and uses IPMI-based fencing (STONITH) of the host. The current HA mechanism provides no such method of fencing (powering off) a host, and whether VM HA fails depends on the circumstances (environment issues, ACS version, etc.).
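
For readers who want to try Host HA on a KVM/NFS host once it is available, the order of API calls could look roughly like the sketch below. It only builds the request parameters and never contacts a management server. The command names (configureOutOfBandManagement, enableOutOfBandManagementForHost, configureHAForHost, enableHAForHost), the "ipmitool" driver and the "kvmhaprovider" provider name are given as I understand them from the Host HA documentation and should be verified against the API reference for your release; the host ID, IPMI address and credentials are placeholders.

```python
from urllib.parse import urlencode


def host_ha_commands(host_id, ipmi_address, ipmi_user, ipmi_password, ipmi_port=623):
    """Build, in order, the API requests assumed to be needed to enable Host HA on one KVM host."""
    return [
        # 1. Tell CloudStack how to reach the host's IPMI interface (used for fencing).
        {
            "command": "configureOutOfBandManagement",
            "hostid": host_id,
            "driver": "ipmitool",
            "address": ipmi_address,
            "port": str(ipmi_port),
            "username": ipmi_user,
            "password": ipmi_password,
        },
        # 2. Turn out-of-band management on for the host.
        {"command": "enableOutOfBandManagementForHost", "hostid": host_id},
        # 3. Select the HA provider (assumed name of the default KVM provider).
        {"command": "configureHAForHost", "hostid": host_id, "provider": "kvmhaprovider"},
        # 4. Enable HA for the host.
        {"command": "enableHAForHost", "hostid": host_id},
    ]


if __name__ == "__main__":
    # Placeholder values; a real request also needs an API key and a signature.
    for params in host_ha_commands("HOST-UUID", "10.0.0.50", "admin", "secret"):
        print(urlencode(params))
```

Out-of-band management is configured first because the HA provider relies on it to power the host off before any VM is restarted elsewhere.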

As Simon mentioned, we plan to have a (host) HA provider that works with Ceph in the near future.

Regards.

________________________________
From: Simon Weller <sw...@ena.com.INVALID>
Sent: Thursday, November 2, 2017 7:27:22 PM
To: users@cloudstack.apache.org
Subject: Re: Problems with KVM HA & STONITH

James,


Ceph is a great solution and we run all of our ACS storage on Ceph. Note that it adds another layer of complexity to your installation, so you're going to need to develop some expertise with that platform to get comfortable with how it works. Typically you don't want to mix Ceph with your ACS hosts. We in fact deploy 3 separate Ceph Monitors, and then scale OSDs as required on a per-cluster basis in order to add additional resiliency (so every KVM ACS cluster has its own Ceph "POD"). We also use Ceph for S3 storage (on completely separate Ceph clusters) for some other services.


NFS is much simpler to maintain for smaller installations, in my opinion. If the IO load you're looking at isn't going to be insanely high, you could look at building a 2-node NFS cluster using Pacemaker and DRBD for data replication between nodes. That would reduce your storage requirement to 2 fairly low-power servers (NFS is not very CPU intensive).


Currently, on a host failure when using storage other than NFS on KVM, you will not see HA occur until you take the failed host out of the ACS cluster. This is a historical limitation: ACS could not confirm the host had been fenced correctly, so to avoid potential data corruption (due to 2 hosts mounting the same storage), it doesn't do anything until the operator intervenes. As of ACS 4.10, IPMI-based fencing is supported on NFS, and we're planning on developing similar support for Ceph.


Since you're a school district, I'm more than happy to jump on the phone with you to talk you through these options if you'd like.


- Si



rohit.yadav@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


Re: Problems with KVM HA & STONITH

Posted by Simon Weller <sw...@ena.com.INVALID>.
James,


Ceph is a great solution and we run all of our ACS storage on Ceph. Note that it adds another layer of complexity to your installation, so you're going to need to develop some expertise with that platform to get comfortable with how it works. Typically you don't want to mix Ceph with your ACS hosts. We in fact deploy 3 separate Ceph Monitors, and then scale OSDs as required on a per-cluster basis in order to add additional resiliency (so every KVM ACS cluster has its own Ceph "POD"). We also use Ceph for S3 storage (on completely separate Ceph clusters) for some other services.


NFS is much simpler to maintain for smaller installations, in my opinion. If the IO load you're looking at isn't going to be insanely high, you could look at building a 2-node NFS cluster using Pacemaker and DRBD for data replication between nodes. That would reduce your storage requirement to 2 fairly low-power servers (NFS is not very CPU intensive).


Currently, on a host failure when using storage other than NFS on KVM, you will not see HA occur until you take the failed host out of the ACS cluster. This is a historical limitation: ACS could not confirm the host had been fenced correctly, so to avoid potential data corruption (due to 2 hosts mounting the same storage), it doesn't do anything until the operator intervenes. As of ACS 4.10, IPMI-based fencing is supported on NFS, and we're planning on developing similar support for Ceph.
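
The fencing rule described above can be modelled in a few lines. This is a toy sketch of the decision only, not CloudStack code; the state names and the function are invented for illustration.

```python
from enum import Enum


class HostState(Enum):
    UP = "up"
    UNREACHABLE = "unreachable"  # contact lost, but the host may still be writing to storage
    FENCED = "fenced"            # confirmed powered off, e.g. through IPMI
    REMOVED = "removed"          # operator took the host out of the cluster


def may_restart_vms_elsewhere(state: HostState) -> bool:
    """VMs are restarted on another host only once the failed host can no longer touch their disks."""
    return state in (HostState.FENCED, HostState.REMOVED)


if __name__ == "__main__":
    for state in HostState:
        print(f"{state.value:12} -> restart VMs elsewhere: {may_restart_vms_elsewhere(state)}")
```

The point is that an unreachable host is not treated as a dead host: until it is fenced or removed, starting the same VM elsewhere could leave two hosts writing to one disk image.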


Since you're a school district, I'm more than happy to jump on the phone with you to talk you through these options if you'd like.


- Si


________________________________
From: McClune, James <mc...@norwalktruckers.net>
Sent: Thursday, November 2, 2017 8:28 AM
To: users@cloudstack.apache.org
Subject: Re: Problems with KVM HA & STONITH

Hi Simon,

Thanks for getting back to me. I created a single NFS share and added it
as primary storage. I think I now better understand how storage works with
ACS.

I was able to get HA working with a single NFS storage pool, which is good.
However, is there a way to incorporate multiple NFS storage pools and still
have the HA functionality? I think something like GlusterFS or Ceph (like
Ivan and Dag described) will work better.

Thank you Simon, Ivan, and Dag for your assistance!
James




--



James McClune

Technical Support Specialist

Norwalk City Schools

Phone: 419-660-6590

mcclunej@norwalktruckers.net

Re: Problems with KVM HA & STONITH

Posted by "McClune, James" <mc...@norwalktruckers.net>.
Hi Simon,

Thanks for getting back to me. I created a single NFS share and added it
as primary storage. I think I now better understand how storage works with
ACS.

I was able to get HA working with a single NFS storage pool, which is good.
However, is there a way to incorporate multiple NFS storage pools and still
have the HA functionality? I think something like GlusterFS or Ceph (like
Ivan and Dag described) will work better.
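
To illustrate why the single share behaves differently from the original three-share setup, here is a toy model of the agent behaviour Ivan described. It is a sketch under two assumptions that are not CloudStack code: every host in the cluster mounts every primary storage pool, and a host reboots as soon as any pool it mounts becomes unreachable. The pool and server names are made up.

```python
def hosts_that_reboot(nfs_server_of_pool, failed_node, hosts):
    """Return the surviving hosts that reboot after `failed_node` goes down.

    nfs_server_of_pool maps each primary storage pool to the machine exporting it.
    """
    lost_pools = {pool for pool, server in nfs_server_of_pool.items() if server == failed_node}
    if not lost_pools:
        return set()
    # Every other host has lost at least one mounted pool, so each of them reboots.
    return {host for host in hosts if host != failed_node}


hosts = ["NODE1", "NODE2", "NODE3"]
per_host_nfs = {"pool1": "NODE1", "pool2": "NODE2", "pool3": "NODE3"}  # each KVM host exports a share
dedicated_nfs = {"pool1": "NFS1"}                                      # one separate storage server

if __name__ == "__main__":
    print(sorted(hosts_that_reboot(per_host_nfs, "NODE2", hosts)))
    print(sorted(hosts_that_reboot(dedicated_nfs, "NODE2", hosts)))
```

With a share on every host, losing any one node takes a pool away from the other two, so they reboot as well. With a separate storage server, a failed KVM host takes no storage with it and the remaining hosts keep running.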

Thank you Simon, Ivan, and Dag for your assistance!
James

On Wed, Nov 1, 2017 at 10:10 AM, Simon Weller <sw...@ena.com.invalid>
wrote:

> James,
>
>
> Try just configuring a single NFS server and see if your setup works. If
> you have 3 NFS shares, across all 3 hosts, I'm wondering whether ACS is
> picking the one you rebooted as the storage for your VMs, and when that
> storage goes away (when you bounce the host), all storage for your VMs
> vanishes and ACS tries to reboot your other hosts.
>
>
> Normally in a simple ACS setup, you would have a separate storage server
> that can serve up NFS to all hosts. If a host dies, then a VM would be
> brought up on a spare host, since all hosts have access to the same storage.
>
> Your other option is to use local storage, but that won't provide HA.
>
>
> - Si
>
>
> ________________________________
> From: McClune, James <mc...@norwalktruckers.net>
> Sent: Monday, October 30, 2017 2:26 PM
> To: users@cloudstack.apache.org
> Subject: Re: Problems with KVM HA & STONITH
>
> Hi Dag,
>
> Thank you for responding back. I am currently running ACS 4.9 on an Ubuntu
> 14.04 VM. I have the three nodes, each having about 1TB of primary storage
> (NFS) and 1TB of secondary storage (NFS). I added each NFS share into ACS.
> All nodes are in a cluster.
>
> Maybe I'm not understanding the setup or misconfigured something. I'm
> trying to set up an HA environment where, if one node running an HA-marked
> VM goes down, the VM will start on another host. When I simulate a network
> disconnect or reboot of a host, all of the nodes go down (STONITH?).
>
> I am unsure how to set up an HA environment if all the nodes in the
> cluster go down. Any help is much appreciated!
>
> Thanks,
> James
>
> On Mon, Oct 30, 2017 at 3:49 AM, Dag Sonstebo <Da...@shapeblue.com>
> wrote:
>
> > Hi James,
> >
> > I think you possibly have over-configured your KVM hosts. If you use NFS
> > (and no clustered file system like CLVM) then there should be no need to
> > configure STONITH. CloudStack takes care of your HA, so this is not
> > something you offload to the KVM host.
> >
> > (As mentioned the only time I have played with STONITH and CloudStack was
> > for CLVM – and I eventually found it not fit for purpose, too unstable and
> > causing too many issues like you describe. Note this was for block storage
> > though – not NFS).
> >
> > Regards,
> > Dag Sonstebo
> > Cloud Architect
> > ShapeBlue
> >
> > On 28/10/2017, 03:40, "Ivan Kudryavtsev" <ku...@bw-sw.com> wrote:
> >
> >     Hi. If a node loses its NFS host, it reboots (ACS agent behaviour). If
> >     you really have 3 storages, you'll get a cluster-wide reboot every time
> >     a host goes down.
> >
> >     On 28 Oct 2017 at 3:02, "Simon Weller" <sw...@ena.com.invalid> wrote:
> >
> >     > Hi James,
> >     >
> >     >
> >     > Can you elaborate a bit further on the storage? You say you're running
> >     > NFS on all 3 nodes, can you explain how it is set up?
> >     >
> >     > Also, what version of ACS are you running?
> >     >
> >     >
> >     > - Si
> >     >
> >     >
> >     >
> >     >
> >     > ________________________________
> >     > From: McClune, James <mc...@norwalktruckers.net>
> >     > Sent: Friday, October 27, 2017 2:21 PM
> >     > To: users@cloudstack.apache.org
> >     > Subject: Problems with KVM HA & STONITH
> >     >
> >     > Hello Apache CloudStack Community,
> >     >
> >     > My setup consists of the following:
> >     >
> >     > - Three nodes (NODE1, NODE2, and NODE3)
> >     > NODE1 is running Ubuntu 16.04.3, NODE2 is running Ubuntu 16.04.3, and
> >     > NODE3 is running Ubuntu 14.04.5.
> >     > - Management Server (running on separate VM, not in cluster)
> >     >
> >     > The three nodes use KVM as the hypervisor. I also configured primary and
> >     > secondary storage on all three of the nodes. I'm using NFS for the
> >     > primary & secondary storage. VM operations work great. Live migration
> >     > works great.
> >     >
> >     > However, when a host goes down, the HA functionality does not work at
> >     > all. Instead of spinning up the VM on another available host, the down
> >     > host seems to trigger STONITH. When STONITH happens, all hosts in the
> >     > cluster go down. This not only causes no HA, but also downs perfectly
> >     > good VMs. I have read countless articles and documentation related to
> >     > this issue. I still cannot find a viable solution for this issue. I
> >     > really want to use Apache CloudStack, but cannot implement this in
> >     > production when STONITH happens.
> >     >
> >     > I think I have something misconfigured. I thought I would reach out to
> >     > the CloudStack community and ask for some friendly assistance.
> >     >
> >     > If there is anything (system-wise) you request in order to further
> >     > troubleshoot this issue, please let me know and I'll send. I appreciate
> >     > any help in this issue!
> >     >
> >     > --
> >     >
> >     > Thanks,
> >     >
> >     > James
> >     >
> >
> >
> >
> > Dag.Sonstebo@shapeblue.com



-- 



James McClune

Technical Support Specialist

Norwalk City Schools

Phone: 419-660-6590

mcclunej@norwalktruckers.net