Posted to builds@apache.org by Lance Albertson <la...@osuosl.org> on 2022/07/31 01:23:11 UTC

[Hosting] UNPLANNED: Ganeti hypervisor node reboot

All,

One of our production Ganeti nodes (gprod3) rebooted on its own for some
reason. All of the VMs should be back online, but you might double-check
that your services are running properly. There was a thundering herd
problem after the node booted: most of the VMs ran a file system check at
once, which caused high I/O. I had a few VMs that had problems with
services such as polkit timing out, causing other issues on the VMs. A
reboot of those VMs seems to have fixed the issue.

The affected VMs are the following:

    - chiral.oftc.net
    - civicrm.osm.osuosl.org
    - lf-bugs.osuosl.org
    - lf-lists.osuosl.org
    - mageiavm.osuosl.org
    - ntpsec-service3.osuosl.org
    - osu1php.osuosl.org
    - web3.osuosl.org
    - www1.phpbb.com

If you have any other issues related to this, please send an email to
support.

Thanks-

-- 
Lance Albertson
Director
Oregon State University | Open Source Lab

Re: [Hosting] UNPLANNED: Ganeti hypervisor node reboot

Posted by Lance Albertson <la...@osuosl.org>.
We just had the same thing happen to another node (gprod1). All of the VMs
are back online now. The affected VMs are the following:

    - apereo1.osuosl.org
    - app2.osuosl.org
    - area51-1.phpbb.com
    - buildroot-sources.osuosl.org
    - deluge.osuosl.org
    - jenkins-radish.osuosl.org
    - ldap1.ntf.osuosl.org
    - mandrivausers2.osuosl.org
    - scripts.phpbb.com
    - snowdrift-app.osuosl.org
    - snowdrift-smtp.osuosl.org


-- 
Lance Albertson
Director
Oregon State University | Open Source Lab