Posted to builds@apache.org by Mike Jumper <mi...@guac-dev.org> on 2018/02/02 19:43:23 UTC

Intermittent (and node-specific?) build failures

Hello all,

We (Apache Guacamole) have been having intermittent issues with builds
failing during the initial git checkout (see below), during the creation of
a lock file as Maven tries to pull down a build dependency, etc. Initially,
this seemed tied to node H23, and seeing that other builds explicitly
exclude this node in their label expressions, we have done so as well ...
but these failures still occasionally occur on other nodes.

Is there a better way to defend against this than excluding nodes on a
case-by-case basis?

Many thanks,

- Mike

---------- Forwarded message ----------
From: Mike Jumper <mi...@guac-dev.org>
Date: Wed, Jan 31, 2018 at 8:01 PM
Subject: Re: Build failed in Jenkins: guacamole-server-coverity #16
To: dev@guacamole.apache.org


On Wed, Jan 31, 2018 at 7:47 PM, Apache Jenkins Server
<je...@builds.apache.org> wrote:
> See <https://builds.apache.org/job/guacamole-server-coverity/16/display/redirect>
>
> ------------------------------------------
> Started by user mjumper
> [EnvInject] - Loading node environment variables.
> Building remotely on H27 (ubuntu xenial) in workspace <https://builds.apache.org/job/guacamole-server-coverity/ws/>
>  > git rev-parse --is-inside-work-tree # timeout=10
> Fetching changes from the remote Git repository
>  > git config remote.origin.url https://git-wip-us.apache.org/repos/asf/guacamole-server.git # timeout=10
> ERROR: Error fetching remote repo 'origin'
> hudson.plugins.git.GitException: Failed to fetch from https://git-wip-us.apache.org/repos/asf/guacamole-server.git
>         at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:825)
>         at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1092)
>         at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1123)
>         at hudson.scm.SCM.checkout(SCM.java:495)
>         at hudson.model.AbstractProject.checkout(AbstractProject.java:1202)
>         at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
>         at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
>         at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
>         at hudson.model.Run.execute(Run.java:1724)
>         at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
>         at hudson.model.ResourceController.execute(ResourceController.java:97)
>         at hudson.model.Executor.run(Executor.java:421)
> Caused by: hudson.plugins.git.GitException: Command "git config remote.origin.url https://git-wip-us.apache.org/repos/asf/guacamole-server.git" returned status code 4:
> stdout:
> stderr: error: failed to write new configuration file <https://builds.apache.org/job/guacamole-server-coverity/ws/guacamole-server/.git/config.lock>
>

I'll ask on builds@apache.org regarding all these inexplicable
failures. I suspect some of the nodes might be having disk space
issues...

- Mike

Re: Intermittent (and node-specific?) build failures

Posted by Mike Jumper <mi...@guac-dev.org>.
OK - I'll remove the exclusions and give these suggestions a shot the next
time we experience such a failure. If changes to the git behavior of the
job don't mitigate things, I'll open an Infra ticket to hopefully chase
down the underlying cause.

Thanks,

- Mike

On Sat, Feb 3, 2018 at 8:34 AM, Chris Lambertus <cm...@apache.org> wrote:

> H27 has plenty of space. Excluding build nodes is usually a bad idea; I’d
> rather we (Infra) work with you to determine the root cause of the
> failures. In this case, the error does suggest an out-of-space condition
> (possibly transient), but it could also be something funky with the git
> clone. Try setting Jenkins to delete the workspace before the build starts
> and/or set Git -> Additional Behaviours -> Wipe out repository & force clone.
>
> -Chris

Re: Intermittent (and node-specific?) build failures

Posted by Chris Lambertus <cm...@apache.org>.
H27 has plenty of space. Excluding build nodes is usually a bad idea; I’d rather we (Infra) work with you to determine the root cause of the failures. In this case, the error does suggest an out-of-space condition (possibly transient), but it could also be something funky with the git clone. Try setting Jenkins to delete the workspace before the build starts and/or set Git -> Additional Behaviours -> Wipe out repository & force clone.
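
The two hypotheses above (an out-of-space condition versus a damaged clone) can be told apart with a quick check on the node before the workspace is wiped. What follows is only a minimal sketch and not something taken from this thread: the function name `diagnose_workspace`, the 1 GiB threshold, and the idea of running it as a pre-build step are all assumptions made for illustration. It looks at free disk space, at a leftover `.git/config.lock` (the file named in the error), and at whether the build user can create files inside `.git` at all.

```python
import shutil
import tempfile
from pathlib import Path


def diagnose_workspace(workspace, min_free_bytes=1 << 30):
    """Return a list of human-readable findings for a checkout directory.

    `min_free_bytes` is an arbitrary illustrative threshold (1 GiB).
    """
    ws = Path(workspace)
    findings = []

    # 1. Out-of-space check on the filesystem that holds the workspace.
    free = shutil.disk_usage(ws).free
    if free < min_free_bytes:
        findings.append(f"low disk space: {free} bytes free")

    # 2. A leftover lock file stops git from rewriting .git/config.
    lock = ws / ".git" / "config.lock"
    if lock.exists():
        findings.append(f"stale lock file: {lock}")

    # 3. Confirm the current user can actually create files inside .git.
    git_dir = ws / ".git"
    if git_dir.is_dir():
        try:
            with tempfile.NamedTemporaryFile(dir=git_dir):
                pass
        except OSError as exc:
            findings.append(f"cannot write to {git_dir}: {exc}")

    return findings
```

If this returns an empty list for the job's workspace, neither a full disk nor a leftover lock is the obvious culprit, and wiping the repository and forcing a fresh clone is the next thing to try.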

-Chris



