Posted to general@gump.apache.org by Sander Temme <sa...@temme.net> on 2006/05/15 01:20:50 UTC
The Status of Clarus
Folks,
You may have noticed (or not) that Clarus has not been doing its Gump
runs for a week or two. The issue was that both of the drives that
make up the RAID-1 Gump sits on suddenly went out of commission,
without any notice or warning. This is not supposed to happen, and is
exactly the reason those drives are mirrored. However, when I visited
the colocation facility last week, I shut the box down, pulled and
re-seated these drives and they are now once again available. The fact
that they can up and disappear like this is kind of scary, but I'm
glad they are not actually broken.
So, Gump runs are now back on Clarus, running at the same times as on
vmgump except using gump/trunk.
Results as always available at http://clarus.apache.org/
S.
--
sander@temme.net http://www.temme.net/sander/
PGP FP: 51B4 8727 466A 0BC3 69F4 B7B8 B2BE BC40 1529 24AF
Re: The Status of Clarus
Posted by Steve Loughran <st...@apache.org>.
Sander Temme wrote:
> Folks,
>
> You may have noticed (or not) that Clarus has not been doing its Gump
> runs for a week or two. The issue was that both of the drives that make
> up the RAID-1 Gump sits on suddenly went out of commission, without any
> notice or warning. This is not supposed to happen, and is exactly the
> reason those drives are mirrored.
We call this "RAID minus one", in which you think your disks are
mirrored, but they aren't. It is actually a worse state than raid-0, "no
raid stuff at all", because at least there you know your data is
vulnerable.
> However, when I visited the colocation
> facility last week, I shut the box down, pulled and re-seated these
> drives and they are now once again available. The fact that they can up
> and disappear like this is kind of scary, but I'm glad they are not
> actually broken.
This is one of those things that are really hard to test.
I've seen SCSI controllers take down drives that were taking too long to
respond; sometimes this can be a transient event, or it can be a
precursor of trouble to come. It could also be the RAID controller that
is failing too; they have their own MTBF, see.
> So, Gump runs are now back on Clarus, running at the same times as on
> vmgump except using gump/trunk.
>
> Results as always available at http://clarus.apache.org/
>
> S.
>
> --
> sander@temme.net http://www.temme.net/sander/
> PGP FP: 51B4 8727 466A 0BC3 69F4 B7B8 B2BE BC40 1529 24AF
>