Posted to dev@spark.apache.org by shane knapp <sk...@berkeley.edu> on 2015/03/13 20:04:17 UTC

jenkins httpd being flaky

we just started seeing 503 service unavailable errors when visiting
jenkins.

i'm on it and will report back with an all-clear.

Re: jenkins httpd being flaky

Posted by shane knapp <sk...@berkeley.edu>.
ok, things seem to have stabilized...  httpd hasn't flaked since ~noon, the
hanging PRB job on amp-jenkins-worker-06 was removed w/the restart, and
things are now building.

i cancelled and retriggered a bunch of PRB builds, btw:
4848 (https://github.com/apache/spark/pull/3699)
5922 (https://github.com/apache/spark/pull/4733)
5987 (https://github.com/apache/spark/pull/4986)
6222 (https://github.com/apache/spark/pull/4964)
6325 (https://github.com/apache/spark/pull/5018)

as well as:
spark-master-maven-with-yarn

sorry for the inconvenience...  i'm still a little stumped as to what
happened, but i think it was a confluence of events (httpd flaking,
problems at github, mercury in retrograde, friday thinking it's monday).

shane

Re: jenkins httpd being flaky

Posted by shane knapp <sk...@berkeley.edu>.
i tried a couple of things, but will also be doing a jenkins reboot as soon
as the current batch of builds finishes.



Re: jenkins httpd being flaky

Posted by shane knapp <sk...@berkeley.edu>.
ok we have a few different things happening:

1) httpd on the jenkins master is randomly (though not currently) flaking
out and causing visits to the site to return a 503.  nothing in the logs
shows any problems.

2) there are some github timeouts, which i tracked down; i think it's a
problem on github's end (see:  https://status.github.com/ and scroll
down to 'mean hook delivery time')

3) we have one spark job w/a strange ivy lock issue that i just
retriggered (https://github.com/apache/spark/pull/4964)

4) there's an errant, unkillable pull request builder job
(https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28574/console)

more updates forthcoming.
