Posted to dev@spark.apache.org by shane knapp <sk...@berkeley.edu> on 2015/09/16 17:40:24 UTC

JENKINS: downtime next week, wed and thurs mornings (9-23 and 9-24)

good morning, denizens of the aether!

your hard working build system (and some associated infrastructure)
has been in need of some updates and housecleaning for quite a while
now.  we will be splitting the maintenance over two mornings to
minimize impact.

here's the plan:

7am-9am wednesday, 9-23-15  (or 23-9-15 for those not in amurrica):
* firewall taken offline for system and firewall updates
* expected downtime:  maybe an hour, but we'll say two just in case
* this will be done by jkuroda (CCed on this message)

630am-10am thursday, 9-24-15:
* jenkins update to 1.629 (we're a few months behind in versions, and
some big bugs have been fixed)
* jenkins master and worker system package updates
* all systems get a reboot (lots of hanging java processes have been
building up over the months)
* builds will stop being accepted ~630am, and i'll kill any hangers-on
at 730am, and retrigger once we're done
* expected downtime:  3.5 hours or so
* i will also be testing out some of my shiny new ansible playbooks
for the system updates!


please let me know if you have any questions, or requests to postpone
this maintenance.  thanks in advance!

shane & jon

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: JENKINS: downtime next week, wed and thurs mornings (9-23 and 9-24)

Posted by Reynold Xin <rx...@databricks.com>.
Thanks Shane and Jon for the heads up.


Re: JENKINS: downtime next week, wed and thurs mornings (9-23 and 9-24)

Posted by shane knapp <sk...@berkeley.edu>.
...and we're finished and now building!

On Thu, Sep 24, 2015 at 7:19 AM, shane knapp <sk...@berkeley.edu> wrote:
> this is happening now.
>
> On Tue, Sep 22, 2015 at 10:07 AM, shane knapp <sk...@berkeley.edu> wrote:
>> ok, here's the updated downtime schedule for this week:
>>
>> wednesday, sept 23rd:
>>
>> firewall maintenance cancelled, as jon took care of the update
>> saturday morning while we were bringing jenkins back up after the colo
>> fire
>>
>> thursday, sept 24th:
>>
>> jenkins maintenance is still scheduled, but abbreviated as some of the
>> maintenance was performed saturday morning as well
>> * new builds will stop being accepted ~630am PDT
>>   - i'll kill any hangers-on at 730am, and after maintenance is done,
>> i will retrigger any killed jobs
>> * jenkins worker system package updates
>>   - amp-jenkins-master was completed on saturday
>>   - this will NOT include kernel updates as moving to
>> 2.6.32-573.3.1.el6 bricked amp-jenkins-master
>> * moving default system java for builds from jdk1.7.0_71 to jdk1.7.0_79
>> * all systems get a reboot
>> * expected downtime:  3.5 hours or so
>>
>> i'll post updates as i progress.
>>
>> also, i'll post a copy of our post-mortem once the dust settles.  it's
>> been, shall we say, a pretty crazy few days.
>>
>> http://news.berkeley.edu/2015/09/19/campus-network-outage/
>>
>> :)
>>
>> On Mon, Sep 21, 2015 at 10:11 AM, shane knapp <sk...@berkeley.edu> wrote:
>>> quick update:  we actually did some of the maintenance on our systems
>>> after the berkeley-wide outage caused by one of our (non-jenkins)
>>> servers halting and catching fire.
>>>
>>> we'll still have some downtime early wednesday, but tomorrow's will be
>>> cancelled.  i'll send out another update real soon now with what we'll
>>> be covering on wednesday once we get our current situation more under
>>> control.  :)
>>>
>>> On Wed, Sep 16, 2015 at 12:15 PM, shane knapp <sk...@berkeley.edu> wrote:
>>>>> 630am-10am thursday, 9-24-15:
>>>>> * jenkins update to 1.629 (we're a few months behind in versions, and
>>>>> some big bugs have been fixed)
>>>>> * jenkins master and worker system package updates
>>>>> * all systems get a reboot (lots of hanging java processes have been
>>>>> building up over the months)
>>>>> * builds will stop being accepted ~630am, and i'll kill any hangers-on
>>>>> at 730am, and retrigger once we're done
>>>>> * expected downtime:  3.5 hours or so
>>>>> * i will also be testing out some of my shiny new ansible playbooks
>>>>> for the system updates!
>>>>>
>>>> i forgot one thing:
>>>>
>>>> * moving default system java for builds from jdk1.7.0_71 to jdk1.7.0_79



Re: JENKINS: downtime next week, wed and thurs mornings (9-23 and 9-24)

Posted by shane knapp <sk...@berkeley.edu>.
this is happening now.




Re: JENKINS: downtime next week, wed and thurs mornings (9-23 and 9-24)

Posted by shane knapp <sk...@berkeley.edu>.
ok, here's the updated downtime schedule for this week:

wednesday, sept 23rd:

firewall maintenance cancelled, as jon took care of the update
saturday morning while we were bringing jenkins back up after the colo
fire

thursday, sept 24th:

jenkins maintenance is still scheduled, but abbreviated as some of the
maintenance was performed saturday morning as well
* new builds will stop being accepted ~630am PDT
  - i'll kill any hangers-on at 730am, and after maintenance is done,
i will retrigger any killed jobs
* jenkins worker system package updates
  - amp-jenkins-master was completed on saturday
  - this will NOT include kernel updates as moving to
2.6.32-573.3.1.el6 bricked amp-jenkins-master
* moving default system java for builds from jdk1.7.0_71 to jdk1.7.0_79
* all systems get a reboot
* expected downtime:  3.5 hours or so

i'll post updates as i progress.

also, i'll post a copy of our post-mortem once the dust settles.  it's
been, shall we say, a pretty crazy few days.

http://news.berkeley.edu/2015/09/19/campus-network-outage/

:)




Re: JENKINS: downtime next week, wed and thurs mornings (9-23 and 9-24)

Posted by shane knapp <sk...@berkeley.edu>.
quick update:  we actually did some of the maintenance on our systems
after the berkeley-wide outage caused by one of our (non-jenkins)
servers halting and catching fire.

we'll still have some downtime early wednesday, but tomorrow's will be
cancelled.  i'll send out another update real soon now with what we'll
be covering on wednesday once we get our current situation more under
control.  :)




Re: JENKINS: downtime next week, wed and thurs mornings (9-23 and 9-24)

Posted by shane knapp <sk...@berkeley.edu>.
> 630am-10am thursday, 9-24-15:
> * jenkins update to 1.629 (we're a few months behind in versions, and
> some big bugs have been fixed)
> * jenkins master and worker system package updates
> * all systems get a reboot (lots of hanging java processes have been
> building up over the months)
> * builds will stop being accepted ~630am, and i'll kill any hangers-on
> at 730am, and retrigger once we're done
> * expected downtime:  3.5 hours or so
> * i will also be testing out some of my shiny new ansible playbooks
> for the system updates!
>
i forgot one thing:

* moving default system java for builds from jdk1.7.0_71 to jdk1.7.0_79
