Posted to dev@impala.apache.org by Paul Rogers <pr...@cloudera.com> on 2019/01/30 18:01:42 UTC

Jenkins restart

Hi All,

We had a bit of trouble restarting Jenkins: some jobs refused to die after waiting 12 hours. So we killed a number of jobs, restarted Jenkins, and restarted the manually submitted dry-run jobs. It looks like some full parallel jobs restarted themselves after the restart.

Please review your jobs and restart as needed.

As it turns out, new security alerts came in yesterday while we were fighting this issue, so I’ll restart Jenkins again tonight.

Thanks,

- Paul


Re: Jenkins restart

Posted by Philip Zeyliger <ph...@cloudera.com>.
When I last looked, the issue was that those tests use a Jenkins "pipeline."
These have multiple steps (let's call them A, B, C) that run in serial. If
A is running when you quiesce Jenkins, B will get queued but never start,
causing the overall job to hang forever.
https://jenkins.impala.io/job/parallel-all-tests/configure does seem to
have a timeout of 10 hours. It looks (
https://jenkins.impala.io/job/parallel-all-tests/4893/console) like that's
been hit, so now we're looking at Jenkins bugs.
https://issues.jenkins-ci.org/browse/JENKINS-40839 is one. Ideally we
wouldn't even be relying on the timeout.
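
To make the hang concrete, here is a toy model of it in plain Python. This is
only a sketch, not Jenkins code: ToyScheduler, run_pipeline and the step names
are hypothetical, and the only behavior assumed is the one described above
(once quiesced, running work may finish but queued work never starts).

```python
class ToyScheduler:
    """Toy stand-in for a CI build queue (hypothetical, not the Jenkins API)."""

    def __init__(self):
        self.quiesced = False
        self.queue = []      # steps waiting to start
        self.running = []    # steps currently executing
        self.finished = []   # steps that completed

    def submit(self, step):
        self.queue.append(step)

    def quiesce(self):
        # Assumed quiesce semantics: nothing new starts after this point.
        self.quiesced = True

    def try_start(self, step):
        """Move a queued step to running; refuse if the scheduler is quiesced."""
        if self.quiesced or step not in self.queue:
            return False
        self.queue.remove(step)
        self.running.append(step)
        return True

    def finish(self, step):
        # A step that is already running is still allowed to complete.
        self.running.remove(step)
        self.finished.append(step)


def run_pipeline(scheduler, steps, quiesce_during=None):
    """Run steps serially; report whether the pipeline finished or hung."""
    for step in steps:
        scheduler.submit(step)
        if not scheduler.try_start(step):
            # The parent pipeline would wait on this step indefinitely.
            return f"hung: {step} is queued but never starts"
        if step == quiesce_during:
            scheduler.quiesce()  # operator quiesces while this step runs
        scheduler.finish(step)
    return "done"
```

Calling run_pipeline(ToyScheduler(), ["A", "B", "C"], quiesce_during="A") lets A
finish, leaves B sitting in the queue, and reports the hang, which is the
situation a job-level timeout is supposed to clean up.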

My experiences with "programming via Jenkins" (i.e., using Jenkins jobs as
functions and tying them together) lead me to try to avoid it.

-- Philip

On Thu, Jan 31, 2019 at 10:05 AM Paul Rogers <pr...@cloudera.com> wrote:

> Hi All,
>
> Jenkins is restarted.
>
> Looks like we have two jobs that seemingly run forever: parallel-all-tests
> and parallel-all-tests-nightly. Twice now they kept running for more than
> 12 hours and had to be killed. Strangely, the elapsed runtime never exceeds
> about four hours. Perhaps someone familiar with these jobs can figure out
> what’s what.
>
> Thanks,

Re: Jenkins restart

Posted by Paul Rogers <pr...@cloudera.com>.
Hi All,

Jenkins is restarted.

Looks like we have two jobs that seemingly run forever: parallel-all-tests and parallel-all-tests-nightly. Twice now they kept running for more than 12 hours and had to be killed. Strangely, the elapsed runtime never exceeds about four hours. Perhaps someone familiar with these jobs can figure out what’s what.
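
For anyone who wants to spot such builds before the next restart, a rough
sketch follows. It assumes build records shaped like the ones the Jenkins JSON
API returns for a job's builds ("number", "building", and "timestamp" as start
time in milliseconds since the epoch). The function name and the 12-hour
default are only illustrative, and fetching the records is left out.

```python
def find_stuck_builds(builds, now_ms, max_hours=12):
    """Return (build number, hours since start) for builds still marked as
    building after more than max_hours.

    `builds` is a list of dicts with "number", "building" and "timestamp"
    (start time in ms since the epoch); this shape is an assumption.
    """
    limit_ms = max_hours * 60 * 60 * 1000
    stuck = []
    for build in builds:
        age_ms = now_ms - build["timestamp"]
        if build.get("building") and age_ms > limit_ms:
            stuck.append((build["number"], age_ms / 3_600_000))
    return stuck
```

Note that this measures wall-clock time since the build started, which is the
number that went past 12 hours here, rather than the elapsed runtime shown for
the individual steps.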

Thanks,
