Posted to dev@trafficcontrol.apache.org by Zach Hoffman <za...@zrhoffman.net> on 2020/06/29 16:24:10 UTC

Jenkins Jobs container order

Hey ATC,

Can we change the CI jobs (
https://builds.apache.org/search/?q=trafficcontrol) to run weasel after the
other containers exit?

When running against https://github.com/apache/trafficcontrol/pull/4758 ,
weasel fails because it references files that existed when it started but
have since been deleted. Example (search for "panic: Failed when enumerating
working directory"):
https://builds.apache.org/job/trafficcontrol-PR/6173/consoleText

After speaking to the author of comcast/weasel (alficles), it sounds like
the problem exists in golang's filesystem walker itself, so a fix would
involve rewriting that library. It would be easier to just avoid the
problem: the non-weasel containers would run to completion before weasel
starts, so that the filesystem does not change while weasel runs.

Is this doable? The PR job seems to fail consistently for that PR (see
https://builds.apache.org/job/trafficcontrol-PR/6173/consoleText ), so that
PR should not be merged until the CI jobs are changed to accommodate weasel.

-Zach

Re: Jenkins Jobs container order

Posted by Zach Hoffman <za...@zrhoffman.net>.
With #4836 merged, the only Jenkins jobs that weasel needs to be removed
from are https://builds.apache.org/job/trafficcontrol-PR/ and
https://builds.apache.org/job/trafficcontrol-master-build/ , right?

-Zach

On Tue, Jun 30, 2020 at 11:16 AM ocket 8888 <oc...@gmail.com> wrote:

> There's a PR open to handle this by running Weasel in GH Actions instead:
> https://github.com/apache/trafficcontrol/pull/4836
>
> After that, though, it would still need to be removed from Jenkins.

Re: Jenkins Jobs container order

Posted by ocket 8888 <oc...@gmail.com>.
There's a PR open to handle this by running Weasel in GH Actions instead:
https://github.com/apache/trafficcontrol/pull/4836

After that, though, it would still need to be removed from Jenkins.

On Tue, Jun 30, 2020 at 8:56 AM Chris Lemmons <al...@gmail.com> wrote:

> Yeah, there's basically no reliable way to have weasel return reliable
> results with a filesystem changing significantly out from under it
> while it runs. It's possible to reduce the size of the race
> conditions, but not to remove the race conditions.
>
> The best solution is probably to run weasel first or last in the build
> if you want to run against the same build root, or to run against a
> separate directory checkout if you're running in parallel.

Re: Jenkins Jobs container order

Posted by Chris Lemmons <al...@gmail.com>.
Yeah, there's basically no reliable way to have weasel return reliable
results with a filesystem changing significantly out from under it
while it runs. It's possible to reduce the size of the race
conditions, but not to remove the race conditions.
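
As a rough illustration of what "reducing the size of the race" could look
like, here is a minimal Go sketch. It is not weasel's code, and the names
walkTolerant and demo are hypothetical. It assumes a walker built on
filepath.Walk and simply records entries that vanish between the directory
listing and the stat call instead of aborting. Files can still change after
they are visited, so the race is narrowed, not removed.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// walkTolerant walks root and calls visit for every entry it can stat.
// Entries that disappear between the directory listing and the stat call
// are collected in skipped instead of aborting the walk.
func walkTolerant(root string, visit func(path string)) (skipped []string, err error) {
	err = filepath.Walk(root, func(path string, info os.FileInfo, walkErr error) error {
		if walkErr != nil {
			if errors.Is(walkErr, os.ErrNotExist) {
				skipped = append(skipped, path)
				return nil // tolerate the vanished entry and keep walking
			}
			return walkErr // any other error still stops the walk
		}
		visit(path)
		return nil
	})
	return skipped, err
}

// demo builds a temporary tree with two files and deletes one of them while
// the walk is in progress, standing in for another container touching the
// same build root. It returns the base names of visited and skipped files.
func demo() (visited, skipped []string, err error) {
	root, err := os.MkdirTemp("", "walk-demo")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(root)

	for _, name := range []string{"a.txt", "b.txt"} {
		if err := os.WriteFile(filepath.Join(root, name), []byte("x"), 0o644); err != nil {
			return nil, nil, err
		}
	}

	gone, err := walkTolerant(root, func(path string) {
		if path == root {
			// The directory has already been listed at this point, so
			// removing b.txt now makes the later stat call fail.
			os.Remove(filepath.Join(root, "b.txt"))
			return
		}
		visited = append(visited, filepath.Base(path))
	})
	for _, p := range gone {
		skipped = append(skipped, filepath.Base(p))
	}
	return visited, skipped, err
}

func main() {
	visited, skipped, err := demo()
	fmt.Println("visited:", visited, "skipped:", skipped, "err:", err)
}
```

A scanner written this way would report on whatever it managed to read, which
is why the results are still not reliable while the tree keeps changing.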

The best solution is probably to run weasel first or last in the build
if you want to run against the same build root, or to run against a
separate directory checkout if you're running in parallel.
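
The "run weasel last" ordering can be sketched in Go as follows. This is only
an illustration of the ordering, not the Jenkins or container configuration
itself: runThenScan and demoOrder are hypothetical names, and the builders
stand in for the non-weasel containers.

```go
package main

import (
	"fmt"
	"sync"
)

// runThenScan starts every builder concurrently, waits until all of them
// have returned, and only then runs scan. The scan therefore never overlaps
// with a step that may still be changing the build root.
func runThenScan(builders []func(), scan func()) {
	var wg sync.WaitGroup
	for _, b := range builders {
		wg.Add(1)
		go func(build func()) {
			defer wg.Done()
			build()
		}(b)
	}
	wg.Wait()
	scan()
}

// demoOrder records the order in which the steps finish.
func demoOrder() []string {
	var mu sync.Mutex
	var events []string
	record := func(name string) {
		mu.Lock()
		defer mu.Unlock()
		events = append(events, name)
	}

	builders := []func(){
		func() { record("build-a") },
		func() { record("build-b") },
		func() { record("build-c") },
	}
	runThenScan(builders, func() { record("scan") })
	return events
}

func main() {
	// The builders may finish in any order, but "scan" is always last.
	fmt.Println(demoOrder())
}
```

Running the scan first instead would mean calling scan() before the loop. The
parallel alternative needs no ordering at all, because the scanner then works
on its own separate checkout.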

On Mon, Jun 29, 2020 at 12:54 PM Zach Hoffman <za...@zrhoffman.net> wrote:
>
> Yeah that would work. Weasel would show up as a separate checkmark or "X",
> so it would be easier to tell when weasel failed or not. If a single check
> fails, we get the "X", so it would be hard to miss.
>
> -Zach

Re: Jenkins Jobs container order

Posted by Zach Hoffman <za...@zrhoffman.net>.
Yeah, that would work. Weasel would show up as a separate checkmark or "X",
so it would be easier to tell whether weasel failed. If a single check
fails, we get the "X", so it would be hard to miss.

-Zach

On Mon, Jun 29, 2020 at 12:49 PM ocket 8888 <oc...@gmail.com> wrote:

> Alternatively it's trivial to run weasel in a GH Action, which would be
> totally independent of whatever's happening in Jenkins.

Re: Jenkins Jobs container order

Posted by ocket 8888 <oc...@gmail.com>.
Alternatively it's trivial to run weasel in a GH Action, which would be
totally independent of whatever's happening in Jenkins.

On Mon, Jun 29, 2020 at 10:24 AM Zach Hoffman <za...@zrhoffman.net> wrote:

> Hey ATC,
>
> Can we change the CI jobs (
> https://builds.apache.org/search/?q=trafficcontrol) to run weasel after
> the
> other containers exit?
>
> When running against https://github.com/apache/trafficcontrol/pull/4758 ,
> weasel fails because it references files that existed when it started but
> had since been deleted. Example (search for panic: Failed when enumerating
> working directory):
> https://builds.apache.org/job/trafficcontrol-PR/6173/consoleText
>
> After speaking to the author of comcast/weasel (alficles), it sounds like
> the problem exists in golang's filesystem walker itself, so a fix would
> involve rewriting that library and would be easier to just avoid. To avoid
> the issue, the non-weasel containers would run before weasel to avoid the
> filesystem changing while weasel runs.
>
> Is this doable? The PR job seems to consistently fail for
> https://builds.apache.org/job/trafficcontrol-PR/6173/consoleText ,so it
> should not be merged until the CI jobs are changed to accommodate weasel.
>
> -Zach
>