Posted to dev@spark.apache.org by shane knapp <sk...@berkeley.edu> on 2015/09/19 09:28:56 UTC

BUILD SYSTEM: fire and power event at UC berkeley's IST colo, jenkins offline

TL; DR:  jenkins is currently down and will probably not be brought
back up until monday morning.

a machine caught fire in the colo this evening, and this tripped the
halon, and now IST is overheating...  it looks like it may have been
one of our servers that popped and caused the event, and thankfully no
one was hurt.

http://ucbsystems.org/

amplab jenkins is currently down.  some other university services are
also down.

jon is currently at the colo unplugging the remaining machines of the
type that caught fire and we've reached out to the vendor who supplied
them to see about an investigation.

IST staff will be starting their investigation tomorrow morning, and
jon or i will post some updates as soon as we get them.

sorry for the inconvenience,

shane

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: BUILD SYSTEM: fire and power event at UC berkeley's IST colo, jenkins offline

Posted by Reynold Xin <rx...@databricks.com>.
Great!

Jon / Shane: Thanks for handling this.

On Saturday, September 19, 2015, shane knapp <sk...@berkeley.edu> wrote:

> we're up and building!  time for breakfast...  :)
>
> https://amplab.cs.berkeley.edu/jenkins/
>
> On Sat, Sep 19, 2015 at 7:35 AM, shane knapp <sknapp@berkeley.edu> wrote:
> > it was definitely one of our servers...  we have no ETA on when
> > jenkins will be back online.  we will need to inspect the rack closely
> > before we plug in and turn everything on.

Re: BUILD SYSTEM: fire and power event at UC berkeley's IST colo, jenkins offline

Posted by shane knapp <sk...@berkeley.edu>.
we're up and building!  time for breakfast...  :)

https://amplab.cs.berkeley.edu/jenkins/

On Sat, Sep 19, 2015 at 7:35 AM, shane knapp <sk...@berkeley.edu> wrote:
> it was definitely one of our servers...  we have no ETA on when
> jenkins will be back online.  we will need to inspect the rack closely
> before we plug in and turn everything on.



Re: BUILD SYSTEM: fire and power event at UC berkeley's IST colo, jenkins offline

Posted by shane knapp <sk...@berkeley.edu>.
it was definitely one of our servers...  we have no ETA on when
jenkins will be back online.  we will need to inspect the rack closely
before we plug in and turn everything on.

Re: BUILD SYSTEM: fire and power event at UC berkeley's IST colo, jenkins offline

Posted by Steve Loughran <st...@hortonworks.com>.
> On 19 Sep 2015, at 08:28, shane knapp <sk...@berkeley.edu> wrote:
> 
> TL; DR:  jenkins is currently down and will probably not be brought
> back up until monday morning.
> 
> a machine caught fire in the colo this evening, and this tripped the
> halon, and now IST is overheating...  it looks like it may have been
> one of our servers that popped and caused the event, and thankfully no
> one was hurt.
> 
> http://ucbsystems.org/
> 
> amplab jenkins is currently down.  some other university services are
> also down.
> 
> jon is currently at the colo unplugging the remaining machines of the
> type that caught fire and we've reached out to the vendor who supplied
> them to see about an investigation.


hope things recover: once a rack has overheated you are in trouble.

I know of some clusters that keep the ToR switches in the middle of the racks for this reason: it's less exposed to the hot air near the ceiling, so the most valuable H/W on the rack gets more protection.

As an added benefit: your Ethernet cables are shorter, which, when you go to 4x1 bonded, makes a big difference in cost.

> 
> IST staff will be starting their investigation tomorrow morning, and
> jon or i will post some updates as soon as we get them.
> 
> sorry for the inconvenience,
> 
> shane
> 

