You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Valentyn Tymofieiev <va...@google.com.INVALID> on 2017/10/23 18:45:51 UTC

Java pre/post commit test suite breakage

Hi Beam-Dev,

It's been >5 days since the last successful run of a
beam_PreCommit_Java_MavenInstall build[1]  and >4 days since last
successful run of beam_PreCommit_Java_MavenInstall[2].

Looking at build logs I see following problems.

1. After October 17, postcommit builds started to fail with

Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
(default) on project beam-parent: Too many files with unapproved license: 1
See RAT report in: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_
Java_MavenInstall/target/beam-parent-2.3.0-SNAPSHOT.rat

The earliest build that I see this error is Postcommit #5052 [3].

This makes me suspect [4] or [5] as a breaking change, since they change
pom files.

Questions:
- Is there a way we can reproduce this failure locally? mvn clean verify
passes locally for me.
- Is there a way we can see the See RAT report mentioned in the error log?


2. Prior to onset of #1 Java Precommit builds no longer complete within
allotted 150 min time. Looking at [6-8] it seems the build makes consistent
progress, but just does not finish on time. We can also see several recent
successful builds with execution time very close to time out [9-11].

I'd like to propose to increase time limit for Java precommit test suite
from 2.5 to 4 hours. 4 hours is long time. I agree that we should
definitely try to reduce the test execution time, and reduce flakiness.
However we need the tests at least pass for now. If we write off failed
test suites as 'flakes' and merge PRs without having a green test signal,
we will have to spend more time tracing breakages such as #1.

Thoughts?

Thanks,
Valentyn

[1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/
[2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/
[3] https://builds.apache.org/job/beam_PostCommit_Java_
MavenInstall/5052/changes

[4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3bc3c67ba3e261
7cb7f11a8c
[5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b3e9a26b6
c9d0bb6dc3

[6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15222/
[7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15195/
[8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15220/

[9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15009/
[10] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15068/
[11] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15016/

Re: Java pre/post commit test suite breakage

Posted by Ben Chambers <bc...@google.com.INVALID>.
I believe license errors are detected by Apache RAT, which are part of the
release profile. I believe you need to run "mvn clean verify -Prelease" to
enable anything associated with that profile.

On Mon, Oct 23, 2017 at 11:46 AM Valentyn Tymofieiev
<va...@google.com.invalid> wrote:

> Hi Beam-Dev,
>
> It's been >5 days since the last successful run of a
> beam_PreCommit_Java_MavenInstall build[1]  and >4 days since last
> successful run of beam_PreCommit_Java_MavenInstall[2].
>
> Looking at build logs I see following problems.
>
> 1. After October 17, postcommit builds started to fail with
>
> Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
> (default) on project beam-parent: Too many files with unapproved license: 1
> See RAT report in: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_
> Java_MavenInstall/target/beam-parent-2.3.0-SNAPSHOT.rat
>
> The earliest build that I see this error is Postcommit #5052 [3].
>
> This makes me suspect [4] or [5] as a breaking change, since they change
> pom files.
>
> Questions:
> - Is there a way we can reproduce this failure locally? mvn clean verify
> passes locally for me.
> - Is there a way we can see the See RAT report mentioned in the error log?
>
>
> 2. Prior to onset of #1 Java Precommit builds no longer complete within
> allotted 150 min time. Looking at [6-8] it seems the build makes consistent
> progress, but just does not finish on time. We can also see several recent
> successful builds with execution time very close to time out [9-11].
>
> I'd like to propose to increase time limit for Java precommit test suite
> from 2.5 to 4 hours. 4 hours is long time. I agree that we should
> definitely try to reduce the test execution time, and reduce flakiness.
> However we need the tests at least pass for now. If we write off failed
> test suites as 'flakes' and merge PRs without having a green test signal,
> we will have to spend more time tracing breakages such as #1.
>
> Thoughts?
>
> Thanks,
> Valentyn
>
> [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/
> [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/
> [3] https://builds.apache.org/job/beam_PostCommit_Java_
> MavenInstall/5052/changes
>
> [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3bc3c67ba3e261
> 7cb7f11a8c
> <https://github.com/apache/beam/commit/d745cc9d8cc1735d3bc3c67ba3e2617cb7f11a8c>
> [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b3e9a26b6
> c9d0bb6dc3
> <https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b3e9a26b6c9d0bb6dc3>
>
> [6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15222/
> [7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15195/
> [8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15220/
>
> [9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15009/
> [10] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15068/
> [11] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15016/
>

Re: Java pre/post commit test suite breakage

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
Agreed. We have to have green to make progress. The timeout should be there
only to halt a build that is a true failure.

It looks like the RAT check is rooted in the current working directory
rather than relative to the pom.xml. That's either our bug or RAT's, since
it makes no sense as a default behavior, but I'll find a way to fix it
either way.

On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <va...@google.com>
wrote:

> Hi Beam-Dev,
>
> It's been >5 days since the last successful run of a
> beam_PreCommit_Java_MavenInstall build[1]  and >4 days since last
> successful run of beam_PreCommit_Java_MavenInstall[2].
>
> Looking at build logs I see following problems.
>
> 1. After October 17, postcommit builds started to fail with
>
> Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
> (default) on project beam-parent: Too many files with unapproved license: 1
> See RAT report in: /home/jenkins/jenkins-slave/wo
> rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
> t-2.3.0-SNAPSHOT.rat
>
> The earliest build that I see this error is Postcommit #5052 [3].
>
> This makes me suspect [4] or [5] as a breaking change, since they change
> pom files.
>
> Questions:
> - Is there a way we can reproduce this failure locally? mvn clean verify
> passes locally for me.
> - Is there a way we can see the See RAT report mentioned in the error
> log?
>
> 2. Prior to onset of #1 Java Precommit builds no longer complete within
> allotted 150 min time. Looking at [6-8] it seems the build makes consistent
> progress, but just does not finish on time. We can also see several recent
> successful builds with execution time very close to time out [9-11].
>
> I'd like to propose to increase time limit for Java precommit test suite
> from 2.5 to 4 hours. 4 hours is long time. I agree that we should
> definitely try to reduce the test execution time, and reduce flakiness.
> However we need the tests at least pass for now. If we write off failed
> test suites as 'flakes' and merge PRs without having a green test signal,
> we will have to spend more time tracing breakages such as #1.
>
> Thoughts?
>
> Thanks,
> Valentyn
>
> [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/
> [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/
> [3] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
> all/5052/changes
>
> [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
> c3c67ba3e2617cb7f11a8c
> [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b
> 3e9a26b6c9d0bb6dc3
>
> [6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15222/
> [7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15195/
> [8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15220/
>
> [9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15009/
> [10] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15068/
> [11] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15016/
>
>

Re: Java pre/post commit test suite breakage

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
On Wed, Oct 25, 2017 at 3:59 PM, Lukasz Cwik <lc...@google.com.invalid>
wrote:

> Another suggestion would be to break apart the project into multiple maven
> projects which are released on independent schedules.
>

I think at some point this makes a lot of sense. Libraries of transforms
built against a stable SDK like Java should be OK, and this is a big chunk
of the build time. Fine to move to postcommit / selective precommit too.

Kenn



> Does anyone have any data points for what has worked for other open source
> communities in the past?
> I only have anecdotal evidence that Apache Ant+Ivy has worked well as it
> was much faster then Apache Maven.
>
> On Wed, Oct 25, 2017 at 1:03 PM, Kenneth Knowles <kl...@google.com.invalid>
> wrote:
>
> > Hi all,
> >
> > I wanted to circle back on this. We've had continual quasi-outages, where
> > Jenkins occasionally schedules jobs but not reliably. Meanwhile, many of
> > our most important jobs continued to hit their timeouts when they did get
> > scheduled. On top of that, I've run the wrong command once before
> merging a
> > PR, resulting in a couple broken precommit runs.
> >
> > Just so everyone knows the actions that have been taken and what we still
> > need to do:
> >
> >  - Java postcommit and precommit timeouts bumped to 240 minutes (!!)
> >  - Seed job made independent of other jobs, so we can experiment safely
> >  - RAT config fixed
> >
> > The latency of precommit is still prohibitive for effective use. Not only
> > are we waiting too long, but the number of workers needed to avoid
> backlog
> > is excessive. Here are some ideas for things we can do:
> >
> >  - Proceed further on the pipeline job, which is what we want to do long
> > term
> >  - Precommit: run only on demand, so workers are not congested by
> automatic
> > builds; we could also pick and choose what we want to run
> >  - Consider non-Maven build orchestrators that can do dependency-driven
> > builds
> >  - Precommit: run fewer tests; I think this would mean leaving out some
> > modules that we are OK finding issues in the postcommit, or making them
> on
> > demand only
> >  - ValidatesRunner tests: Instead of running a single mavenJob, run
> > multiple, once which just installs while skipping everything else, then
> > follow it with just running the tests we care about
> >  - Examples integration tests: separate from the maven precommit; also
> run
> > as sequenced invocations
> >
> > These are just some ideas; I honestly don't know. I think my upper limit
> > for default precommit feedback is probably 30 minutes, and even there I
> am
> > not very happy.
> >
> > Any other suggestions?
> >
> > Kenn
> >
> > On Mon, Oct 23, 2017 at 3:29 PM, Kenneth Knowles <kl...@google.com> wrote:
> >
> > > I want to wait and get some green from Jenkins running against the HEAD
> > > groovy scripts to confirm. I haven't sat at my desk long enough to see
> a
> > > full `mvn -P release clean verify` yet.
> > >
> > > On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <kl...@google.com>
> wrote:
> > >
> > >> I was easily able to reproduce all the sorts of failures (and some
> > more!)
> > >>
> > >> Here are some things that now work that didn't work, or didn't work
> > >> correctly, before
> > >>
> > >>  - mvn apache-rat:check
> > >>  - mvn -P release apache-rat:check
> > >>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
> > >>
> > >> It turns out we had these issues:
> > >>
> > >>  - items in .gitignore that our RAT config did not ignore
> > >>  - sub-modules actually *did* inherit the RAT config, depending on the
> > >> command you run
> > >>  - paths in the RAT exclude were relative to current dir
> > >>  - paths in our RAT exclude that aren't even part of our codebase or
> > >> generated targets, but just things that Jenkins dropped in the
> > workspace,
> > >> and we were cloning directly into the workspace
> > >>
> > >> I think with the new Jenkins config and the new pom.xml things should
> > >> work well.
> > >>
> > >> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do
> not
> > >> take effect from being merged but only when the seed job next runs.
> > (TODO:
> > >> fix this). I am still trying to get the relevant Jenkins UI pages to
> > load
> > >> to get your change incorporated into the live scripts, which were
> > regressed
> > >> by being run from an old PR (TODO: fix the fact that this can happen).
> > >>
> > >> Kenn
> > >>
> > >> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
> > >> valentyn@google.com.invalid> wrote:
> > >>
> > >>> Thanks a lot!
> > >>>
> > >>> Kenn, Were you able to reproduce RAT failures and test the fix
> > locally? I
> > >>> think "mvn clean verify -Prelease" still passes for me.
> > >>>
> > >>> Timeout for the test suite has been increased in
> > >>> https://github.com/apache/
> > >>> beam/pull/4028 <https://github.com/apache/beam/pull/4028>.
> > >>>
> > >>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles
> > <klk@google.com.invalid
> > >>> >
> > >>> wrote:
> > >>>
> > >>> > Wrong link - https://github.com/apache/beam/pull/4027
> > >>> >
> > >>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com>
> > >>> wrote:
> > >>> >
> > >>> > > Yea, root cause is the config bug I described. Proposed fix at
> > >>> > > https://github.com/apache/beam/pull/4019/files. I'm working with
> > >>> infra
> > >>> > to
> > >>> > > sort out other build issues that are probably not related.
> > >>> > >
> > >>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik
> > >>> <lc...@google.com.invalid>
> > >>> > > wrote:
> > >>> > >
> > >>> > >> The build breakage I outlined is being tracked in
> > >>> > >> https://issues.apache.org/jira/browse/BEAM-3092
> > >>> > >>
> > >>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lcwik@google.com
> >
> > >>> wrote:
> > >>> > >>
> > >>> > >> > Another breaking change was caused by
> > https://github.com/apache/
> > >>> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because
> it
> > >>> > changed
> > >>> > >> > the build layout on the Jenkins server and our RAT rules
> > expected
> > >>> to
> > >>> > >> apply
> > >>> > >> > from a root directory. I pinged Kenneth Knowles about it and
> he
> > >>> said
> > >>> > he
> > >>> > >> was
> > >>> > >> > taking a look.
> > >>> > >> >
> > >>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
> > >>> > >> <rangadi@google.com.invalid
> > >>> > >> > > wrote:
> > >>> > >> >
> > >>> > >> >> Regd (1) :
> > >>> > >> >>
> > >>> > >> >> [4] did have have a file without Apache Licence. It was fixed
> > the
> > >>> > next
> > >>> > >> >> day (
> > >>> > >> >> commit
> > >>> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
> > >>> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
> > >>> > >> >> thanks to Ken Knowles who pinged me about it.
> > >>> > >> >>
> > >>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
> > >>> > >> >> valentyn@google.com>
> > >>> > >> >> wrote:
> > >>> > >> >>
> > >>> > >> >> > Hi Beam-Dev,
> > >>> > >> >> >
> > >>> > >> >> > It's been >5 days since the last successful run of a
> > >>> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days
> since
> > >>> last
> > >>> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
> > >>> > >> >> >
> > >>> > >> >> > Looking at build logs I see following problems.
> > >>> > >> >> >
> > >>> > >> >> > 1. After October 17, postcommit builds started to fail with
> > >>> > >> >> >
> > >>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plug
> > >>> in:0.12:check
> > >>> > >> >> > (default) on project beam-parent: Too many files with
> > >>> unapproved
> > >>> > >> >> license: 1
> > >>> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
> > >>> > >> >> > rkspace/beam_PostCommit_Java_
> MavenInstall/target/beam-paren
> > >>> > >> >> > t-2.3.0-SNAPSHOT.rat
> > >>> > >> >> >
> > >>> > >> >> > The earliest build that I see this error is Postcommit
> #5052
> > >>> [3].
> > >>> > >> >> >
> > >>> > >> >> > This makes me suspect [4] or [5] as a breaking change,
> since
> > >>> they
> > >>> > >> change
> > >>> > >> >> > pom files.
> > >>> > >> >> >
> > >>> > >> >> > Questions:
> > >>> > >> >> > - Is there a way we can reproduce this failure locally? mvn
> > >>> clean
> > >>> > >> verify
> > >>> > >> >> > passes locally for me.
> > >>> > >> >> > - Is there a way we can see the See RAT report mentioned in
> > the
> > >>> > error
> > >>> > >> >> > log?
> > >>> > >> >> >
> > >>> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer
> > >>> complete
> > >>> > >> within
> > >>> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build
> > >>> makes
> > >>> > >> >> consistent
> > >>> > >> >> > progress, but just does not finish on time. We can also see
> > >>> several
> > >>> > >> >> recent
> > >>> > >> >> > successful builds with execution time very close to time
> out
> > >>> > [9-11].
> > >>> > >> >> >
> > >>> > >> >> > I'd like to propose to increase time limit for Java
> precommit
> > >>> test
> > >>> > >> suite
> > >>> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we
> > >>> should
> > >>> > >> >> > definitely try to reduce the test execution time, and
> reduce
> > >>> > >> flakiness.
> > >>> > >> >> > However we need the tests at least pass for now. If we
> write
> > >>> off
> > >>> > >> failed
> > >>> > >> >> > test suites as 'flakes' and merge PRs without having a
> green
> > >>> test
> > >>> > >> >> signal,
> > >>> > >> >> > we will have to spend more time tracing breakages such as
> #1.
> > >>> > >> >> >
> > >>> > >> >> > Thoughts?
> > >>> > >> >> >
> > >>> > >> >> > Thanks,
> > >>> > >> >> > Valentyn
> > >>> > >> >> >
> > >>> > >> >> > [1] https://builds.apache.org/job/
> > >>> beam_PreCommit_Java_MavenInsta
> > >>> > ll/
> > >>> > >> >> > [2] https://builds.apache.org/job/
> > >>> beam_PostCommit_Java_MavenInst
> > >>> > all/
> > >>> > >> >> > [3] https://builds.apache.org/job/
> > >>> beam_PostCommit_Java_MavenInst
> > >>> > >> >> > all/5052/changes
> > >>> > >> >> >
> > >>> > >> >> > [4] https://github.com/apache/
> beam/commit/d745cc9d8cc1735d3b
> > >>> > >> >> > c3c67ba3e2617cb7f11a8c
> > >>> > >> >> > [5] https://github.com/apache/beam
> > >>> /commit/0d8ab6cbbc762dd9f9be1b
> > >>> > >> >> > 3e9a26b6c9d0bb6dc3
> > >>> > >> >> >
> > >>> > >> >> > [6] https://builds.apache.org/job/
> > >>> beam_PreCommit_Java_MavenInsta
> > >>> > >> >> ll/15222/
> > >>> > >> >> > [7] https://builds.apache.org/job/
> > >>> beam_PreCommit_Java_MavenInsta
> > >>> > >> >> ll/15195/
> > >>> > >> >> > [8] https://builds.apache.org/job/
> > >>> beam_PreCommit_Java_MavenInsta
> > >>> > >> >> ll/15220/
> > >>> > >> >> >
> > >>> > >> >> > [9] https://builds.apache.org/job/
> > >>> beam_PreCommit_Java_MavenInsta
> > >>> > >> >> ll/15009/
> > >>> > >> >> > [10] https://builds.apache.org/job/
> > >>> beam_PreCommit_Java_MavenInsta
> > >>> > >> >> ll/15068/
> > >>> > >> >> > [11] https://builds.apache.org/job/
> > >>> beam_PreCommit_Java_MavenInsta
> > >>> > >> >> ll/15016/
> > >>> > >> >> >
> > >>> > >> >> >
> > >>> > >> >>
> > >>> > >> >
> > >>> > >> >
> > >>> > >>
> > >>> > >
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>

Re: Java pre/post commit test suite breakage

Posted by Lukasz Cwik <lc...@google.com.INVALID>.
Another suggestion would be to break apart the project into multiple maven
projects which are released on independent schedules.

Does anyone have any data points for what has worked for other open source
communities in the past?
I only have anecdotal evidence that Apache Ant+Ivy has worked well as it
was much faster then Apache Maven.

On Wed, Oct 25, 2017 at 1:03 PM, Kenneth Knowles <kl...@google.com.invalid>
wrote:

> Hi all,
>
> I wanted to circle back on this. We've had continual quasi-outages, where
> Jenkins occasionally schedules jobs but not reliably. Meanwhile, many of
> our most important jobs continued to hit their timeouts when they did get
> scheduled. On top of that, I've run the wrong command once before merging a
> PR, resulting in a couple broken precommit runs.
>
> Just so everyone knows the actions that have been taken and what we still
> need to do:
>
>  - Java postcommit and precommit timeouts bumped to 240 minutes (!!)
>  - Seed job made independent of other jobs, so we can experiment safely
>  - RAT config fixed
>
> The latency of precommit is still prohibitive for effective use. Not only
> are we waiting too long, but the number of workers needed to avoid backlog
> is excessive. Here are some ideas for things we can do:
>
>  - Proceed further on the pipeline job, which is what we want to do long
> term
>  - Precommit: run only on demand, so workers are not congested by automatic
> builds; we could also pick and choose what we want to run
>  - Consider non-Maven build orchestrators that can do dependency-driven
> builds
>  - Precommit: run fewer tests; I think this would mean leaving out some
> modules that we are OK finding issues in the postcommit, or making them on
> demand only
>  - ValidatesRunner tests: Instead of running a single mavenJob, run
> multiple, once which just installs while skipping everything else, then
> follow it with just running the tests we care about
>  - Examples integration tests: separate from the maven precommit; also run
> as sequenced invocations
>
> These are just some ideas; I honestly don't know. I think my upper limit
> for default precommit feedback is probably 30 minutes, and even there I am
> not very happy.
>
> Any other suggestions?
>
> Kenn
>
> On Mon, Oct 23, 2017 at 3:29 PM, Kenneth Knowles <kl...@google.com> wrote:
>
> > I want to wait and get some green from Jenkins running against the HEAD
> > groovy scripts to confirm. I haven't sat at my desk long enough to see a
> > full `mvn -P release clean verify` yet.
> >
> > On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <kl...@google.com> wrote:
> >
> >> I was easily able to reproduce all the sorts of failures (and some
> more!)
> >>
> >> Here are some things that now work that didn't work, or didn't work
> >> correctly, before
> >>
> >>  - mvn apache-rat:check
> >>  - mvn -P release apache-rat:check
> >>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
> >>
> >> It turns out we had these issues:
> >>
> >>  - items in .gitignore that our RAT config did not ignore
> >>  - sub-modules actually *did* inherit the RAT config, depending on the
> >> command you run
> >>  - paths in the RAT exclude were relative to current dir
> >>  - paths in our RAT exclude that aren't even part of our codebase or
> >> generated targets, but just things that Jenkins dropped in the
> workspace,
> >> and we were cloning directly into the workspace
> >>
> >> I think with the new Jenkins config and the new pom.xml things should
> >> work well.
> >>
> >> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do not
> >> take effect from being merged but only when the seed job next runs.
> (TODO:
> >> fix this). I am still trying to get the relevant Jenkins UI pages to
> load
> >> to get your change incorporated into the live scripts, which were
> regressed
> >> by being run from an old PR (TODO: fix the fact that this can happen).
> >>
> >> Kenn
> >>
> >> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
> >> valentyn@google.com.invalid> wrote:
> >>
> >>> Thanks a lot!
> >>>
> >>> Kenn, Were you able to reproduce RAT failures and test the fix
> locally? I
> >>> think "mvn clean verify -Prelease" still passes for me.
> >>>
> >>> Timeout for the test suite has been increased in
> >>> https://github.com/apache/
> >>> beam/pull/4028 <https://github.com/apache/beam/pull/4028>.
> >>>
> >>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles
> <klk@google.com.invalid
> >>> >
> >>> wrote:
> >>>
> >>> > Wrong link - https://github.com/apache/beam/pull/4027
> >>> >
> >>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com>
> >>> wrote:
> >>> >
> >>> > > Yea, root cause is the config bug I described. Proposed fix at
> >>> > > https://github.com/apache/beam/pull/4019/files. I'm working with
> >>> infra
> >>> > to
> >>> > > sort out other build issues that are probably not related.
> >>> > >
> >>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik
> >>> <lc...@google.com.invalid>
> >>> > > wrote:
> >>> > >
> >>> > >> The build breakage I outlined is being tracked in
> >>> > >> https://issues.apache.org/jira/browse/BEAM-3092
> >>> > >>
> >>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com>
> >>> wrote:
> >>> > >>
> >>> > >> > Another breaking change was caused by
> https://github.com/apache/
> >>> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it
> >>> > changed
> >>> > >> > the build layout on the Jenkins server and our RAT rules
> expected
> >>> to
> >>> > >> apply
> >>> > >> > from a root directory. I pinged Kenneth Knowles about it and he
> >>> said
> >>> > he
> >>> > >> was
> >>> > >> > taking a look.
> >>> > >> >
> >>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
> >>> > >> <rangadi@google.com.invalid
> >>> > >> > > wrote:
> >>> > >> >
> >>> > >> >> Regd (1) :
> >>> > >> >>
> >>> > >> >> [4] did have have a file without Apache Licence. It was fixed
> the
> >>> > next
> >>> > >> >> day (
> >>> > >> >> commit
> >>> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
> >>> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
> >>> > >> >> thanks to Ken Knowles who pinged me about it.
> >>> > >> >>
> >>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
> >>> > >> >> valentyn@google.com>
> >>> > >> >> wrote:
> >>> > >> >>
> >>> > >> >> > Hi Beam-Dev,
> >>> > >> >> >
> >>> > >> >> > It's been >5 days since the last successful run of a
> >>> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since
> >>> last
> >>> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
> >>> > >> >> >
> >>> > >> >> > Looking at build logs I see following problems.
> >>> > >> >> >
> >>> > >> >> > 1. After October 17, postcommit builds started to fail with
> >>> > >> >> >
> >>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plug
> >>> in:0.12:check
> >>> > >> >> > (default) on project beam-parent: Too many files with
> >>> unapproved
> >>> > >> >> license: 1
> >>> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
> >>> > >> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
> >>> > >> >> > t-2.3.0-SNAPSHOT.rat
> >>> > >> >> >
> >>> > >> >> > The earliest build that I see this error is Postcommit #5052
> >>> [3].
> >>> > >> >> >
> >>> > >> >> > This makes me suspect [4] or [5] as a breaking change, since
> >>> they
> >>> > >> change
> >>> > >> >> > pom files.
> >>> > >> >> >
> >>> > >> >> > Questions:
> >>> > >> >> > - Is there a way we can reproduce this failure locally? mvn
> >>> clean
> >>> > >> verify
> >>> > >> >> > passes locally for me.
> >>> > >> >> > - Is there a way we can see the See RAT report mentioned in
> the
> >>> > error
> >>> > >> >> > log?
> >>> > >> >> >
> >>> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer
> >>> complete
> >>> > >> within
> >>> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build
> >>> makes
> >>> > >> >> consistent
> >>> > >> >> > progress, but just does not finish on time. We can also see
> >>> several
> >>> > >> >> recent
> >>> > >> >> > successful builds with execution time very close to time out
> >>> > [9-11].
> >>> > >> >> >
> >>> > >> >> > I'd like to propose to increase time limit for Java precommit
> >>> test
> >>> > >> suite
> >>> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we
> >>> should
> >>> > >> >> > definitely try to reduce the test execution time, and reduce
> >>> > >> flakiness.
> >>> > >> >> > However we need the tests at least pass for now. If we write
> >>> off
> >>> > >> failed
> >>> > >> >> > test suites as 'flakes' and merge PRs without having a green
> >>> test
> >>> > >> >> signal,
> >>> > >> >> > we will have to spend more time tracing breakages such as #1.
> >>> > >> >> >
> >>> > >> >> > Thoughts?
> >>> > >> >> >
> >>> > >> >> > Thanks,
> >>> > >> >> > Valentyn
> >>> > >> >> >
> >>> > >> >> > [1] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > ll/
> >>> > >> >> > [2] https://builds.apache.org/job/
> >>> beam_PostCommit_Java_MavenInst
> >>> > all/
> >>> > >> >> > [3] https://builds.apache.org/job/
> >>> beam_PostCommit_Java_MavenInst
> >>> > >> >> > all/5052/changes
> >>> > >> >> >
> >>> > >> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
> >>> > >> >> > c3c67ba3e2617cb7f11a8c
> >>> > >> >> > [5] https://github.com/apache/beam
> >>> /commit/0d8ab6cbbc762dd9f9be1b
> >>> > >> >> > 3e9a26b6c9d0bb6dc3
> >>> > >> >> >
> >>> > >> >> > [6] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15222/
> >>> > >> >> > [7] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15195/
> >>> > >> >> > [8] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15220/
> >>> > >> >> >
> >>> > >> >> > [9] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15009/
> >>> > >> >> > [10] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15068/
> >>> > >> >> > [11] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15016/
> >>> > >> >> >
> >>> > >> >> >
> >>> > >> >>
> >>> > >> >
> >>> > >> >
> >>> > >>
> >>> > >
> >>> > >
> >>> >
> >>>
> >>
> >>
> >
>

Re: Java pre/post commit test suite breakage

Posted by Lukasz Cwik <lc...@google.com.INVALID>.
I think Ken meant to say Guava 20 but the point is still valid.

On Thu, Oct 26, 2017 at 8:02 PM, Kenneth Knowles <kl...@google.com.invalid>
wrote:

> On Thu, Oct 26, 2017 at 9:14 AM, Kenneth Knowles <kl...@google.com> wrote:
>
> > On Wed, Oct 25, 2017 at 6:51 PM, Robert Bradshaw <
> > robertwb@google.com.invalid> wrote:
> >>
> >> These are just some ideas; I honestly don't know. I think my upper limit
> >> for default precommit feedback is probably 30 minutes, and even there I
> am
> >> not very happy.
> >>
> >>
> >> We should be able to get a good signal in less than 30 minutes. Lint,
> >> smoke
> >> tests, etc. I think there is a more extensive set of tests that should
> >> pass
> >> before merging.
> >>
> >
> > Yea, definitely. Digging into this in particular because I think it is
> > quite promising as a way to reduce build congestion.
> >
> > It looks like `mvn clean test -Dfindbugs.skip` would finish in 10 minutes
> > if we fixed it. Currently it looks like some modules don't work under
> this
> > invocation, probably requiring some state to be established by prior bits
> > run by `verify`. If we iterated on that until review was done, then 30-90
> > minutes to prevent breakage would be fine, especially since a failure in
> > that longer slower build is probably a signal to add a quicker repro unit
> > test.
> >
>
> Looked into this, and some IOs explicitly use Guava 19, so presumably that
> is important, while the core Java SDK requires Guava 21. This is shaded
> away during `mvn verify` or `mvn install` but for `mvn test` there is no
> packaging or shading, so when the IO tests run with Guava 19 taking
> precedence, checkArguments within the Java SDK fail. This one might be
> fixable, but my 5 second takeaway is that our codebase cannot build against
> itself without some shading (or equivalent).
>
> Kenn
>
>
>
>
>
> >
> >>
> >> Also it might not hurt to audit our tests and see if there are a small
> >> number that take a disproportionate amount of time.
> >>
> >> Any other suggestions?
> >>
> >> Kenn
> >>
> >> On Mon, Oct 23, 2017 at 3:29 PM, Kenneth Knowles <kl...@google.com>
> wrote:
> >>
> >> > I want to wait and get some green from Jenkins running against the
> HEAD
> >> > groovy scripts to confirm. I haven't sat at my desk long enough to
> see a
> >> > full `mvn -P release clean verify` yet.
> >> >
> >> > On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <kl...@google.com>
> >> wrote:
> >> >
> >> >> I was easily able to reproduce all the sorts of failures (and some
> >> more!)
> >> >>
> >> >> Here are some things that now work that didn't work, or didn't work
> >> >> correctly, before
> >> >>
> >> >>  - mvn apache-rat:check
> >> >>  - mvn -P release apache-rat:check
> >> >>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
> >> >>
> >> >> It turns out we had these issues:
> >> >>
> >> >>  - items in .gitignore that our RAT config did not ignore
> >> >>  - sub-modules actually *did* inherit the RAT config, depending on
> the
> >> >> command you run
> >> >>  - paths in the RAT exclude were relative to current dir
> >> >>  - paths in our RAT exclude that aren't even part of our codebase or
> >> >> generated targets, but just things that Jenkins dropped in the
> >> workspace,
> >> >> and we were cloning directly into the workspace
> >> >>
> >> >> I think with the new Jenkins config and the new pom.xml things should
> >> >> work well.
> >> >>
> >> >> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do
> >> not
> >> >> take effect from being merged but only when the seed job next runs.
> >> (TODO:
> >> >> fix this). I am still trying to get the relevant Jenkins UI pages to
> >> load
> >> >> to get your change incorporated into the live scripts, which were
> >> regressed
> >> >> by being run from an old PR (TODO: fix the fact that this can
> happen).
> >> >>
> >> >> Kenn
> >> >>
> >> >> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
> >> >> valentyn@google.com.invalid> wrote:
> >> >>
> >> >>> Thanks a lot!
> >> >>>
> >> >>> Kenn, Were you able to reproduce RAT failures and test the fix
> >> locally?
> >> I
> >> >>> think "mvn clean verify -Prelease" still passes for me.
> >> >>>
> >> >>> Timeout for the test suite has been increased in
> >> >>> https://github.com/apache/
> >> >>> beam/pull/4028 <https://github.com/apache/beam/pull/4028>.
> >> >>>
> >> >>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles
> >> <klk@google.com.invalid
> >> >>> >
> >> >>> wrote:
> >> >>>
> >> >>> > Wrong link - https://github.com/apache/beam/pull/4027
> >> >>> >
> >> >>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com>
> >> >>> wrote:
> >> >>> >
> >> >>> > > Yea, root cause is the config bug I described. Proposed fix at
> >> >>> > > https://github.com/apache/beam/pull/4019/files. I'm working
> with
> >> >>> infra
> >> >>> > to
> >> >>> > > sort out other build issues that are probably not related.
> >> >>> > >
> >> >>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik
> >> >>> <lc...@google.com.invalid>
> >> >>> > > wrote:
> >> >>> > >
> >> >>> > >> The build breakage I outlined is being tracked in
> >> >>> > >> https://issues.apache.org/jira/browse/BEAM-3092
> >> >>> > >>
> >> >>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <
> lcwik@google.com>
> >> >>> wrote:
> >> >>> > >>
> >> >>> > >> > Another breaking change was caused by
> >> https://github.com/apache/
> >> >>> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because
> >> it
> >> >>> > changed
> >> >>> > >> > the build layout on the Jenkins server and our RAT rules
> >> expected
> >> >>> to
> >> >>> > >> apply
> >> >>> > >> > from a root directory. I pinged Kenneth Knowles about it and
> he
> >> >>> said
> >> >>> > he
> >> >>> > >> was
> >> >>> > >> > taking a look.
> >> >>> > >> >
> >> >>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
> >> >>> > >> <rangadi@google.com.invalid
> >> >>> > >> > > wrote:
> >> >>> > >> >
> >> >>> > >> >> Regd (1) :
> >> >>> > >> >>
> >> >>> > >> >> [4] did have have a file without Apache Licence. It was
> fixed
> >> the
> >> >>> > next
> >> >>> > >> >> day (
> >> >>> > >> >> commit
> >> >>> > >> >> <https://github.com/apache/beam/commit/
> 249da9b8a1e86d0fe4c3d
> >> >>> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b76
> 7a>),
> >> >>> > >> >> thanks to Ken Knowles who pinged me about it.
> >> >>> > >> >>
> >> >>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
> >> >>> > >> >> valentyn@google.com>
> >> >>> > >> >> wrote:
> >> >>> > >> >>
> >> >>> > >> >> > Hi Beam-Dev,
> >> >>> > >> >> >
> >> >>> > >> >> > It's been >5 days since the last successful run of a
> >> >>> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days
> >> since
> >> >>> last
> >> >>> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
> >> >>> > >> >> >
> >> >>> > >> >> > Looking at build logs I see following problems.
> >> >>> > >> >> >
> >> >>> > >> >> > 1. After October 17, postcommit builds started to fail
> with
> >> >>> > >> >> >
> >> >>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plug
> >> >>> in:0.12:check
> >> >>> > >> >> > (default) on project beam-parent: Too many files with
> >> >>> unapproved
> >> >>> > >> >> license: 1
> >> >>> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
> >> >>> > >> >> > rkspace/beam_PostCommit_Java_
> MavenInstall/target/beam-paren
> >> >>> > >> >> > t-2.3.0-SNAPSHOT.rat
> >> >>> > >> >> >
> >> >>> > >> >> > The earliest build that I see this error is Postcommit
> #5052
> >> >>> [3].
> >> >>> > >> >> >
> >> >>> > >> >> > This makes me suspect [4] or [5] as a breaking change,
> since
> >> >>> they
> >> >>> > >> change
> >> >>> > >> >> > pom files.
> >> >>> > >> >> >
> >> >>> > >> >> > Questions:
> >> >>> > >> >> > - Is there a way we can reproduce this failure locally?
> mvn
> >> >>> clean
> >> >>> > >> verify
> >> >>> > >> >> > passes locally for me.
> >> >>> > >> >> > - Is there a way we can see the See RAT report mentioned
> in
> >> the
> >> >>> > error
> >> >>> > >> >> > log?
> >> >>> > >> >> >
> >> >>> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer
> >> >>> complete
> >> >>> > >> within
> >> >>> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build
> >> >>> makes
> >> >>> > >> >> consistent
> >> >>> > >> >> > progress, but just does not finish on time. We can also
> see
> >> >>> several
> >> >>> > >> >> recent
> >> >>> > >> >> > successful builds with execution time very close to time
> out
> >> >>> > [9-11].
> >> >>> > >> >> >
> >> >>> > >> >> > I'd like to propose to increase time limit for Java
> >> precommit
> >> >>> test
> >> >>> > >> suite
> >> >>> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we
> >> >>> should
> >> >>> > >> >> > definitely try to reduce the test execution time, and
> reduce
> >> >>> > >> flakiness.
> >> >>> > >> >> > However we need the tests at least pass for now. If we
> write
> >> >>> off
> >> >>> > >> failed
> >> >>> > >> >> > test suites as 'flakes' and merge PRs without having a
> green
> >> >>> test
> >> >>> > >> >> signal,
> >> >>> > >> >> > we will have to spend more time tracing breakages such as
> >> #1.
> >> >>> > >> >> >
> >> >>> > >> >> > Thoughts?
> >> >>> > >> >> >
> >> >>> > >> >> > Thanks,
> >> >>> > >> >> > Valentyn
> >> >>> > >> >> >
> >> >>> > >> >> > [1] https://builds.apache.org/job/
> >> >>> beam_PreCommit_Java_MavenInsta
> >> >>> > ll/
> >> >>> > >> >> > [2] https://builds.apache.org/job/
> >> >>> beam_PostCommit_Java_MavenInst
> >> >>> > all/
> >> >>> > >> >> > [3] https://builds.apache.org/job/
> >> >>> beam_PostCommit_Java_MavenInst
> >> >>> > >> >> > all/5052/changes
> >> >>> > >> >> >
> >> >>> > >> >> > [4] https://github.com/apache/beam
> >> /commit/d745cc9d8cc1735d3b
> >> >>> > >> >> > c3c67ba3e2617cb7f11a8c
> >> >>> > >> >> > [5] https://github.com/apache/beam
> >> >>> /commit/0d8ab6cbbc762dd9f9be1b
> >> >>> > >> >> > 3e9a26b6c9d0bb6dc3
> >> >>> > >> >> >
> >> >>> > >> >> > [6] https://builds.apache.org/job/
> >> >>> beam_PreCommit_Java_MavenInsta
> >> >>> > >> >> ll/15222/
> >> >>> > >> >> > [7] https://builds.apache.org/job/
> >> >>> beam_PreCommit_Java_MavenInsta
> >> >>> > >> >> ll/15195/
> >> >>> > >> >> > [8] https://builds.apache.org/job/
> >> >>> beam_PreCommit_Java_MavenInsta
> >> >>> > >> >> ll/15220/
> >> >>> > >> >> >
> >> >>> > >> >> > [9] https://builds.apache.org/job/
> >> >>> beam_PreCommit_Java_MavenInsta
> >> >>> > >> >> ll/15009/
> >> >>> > >> >> > [10] https://builds.apache.org/job/
> >> >>> beam_PreCommit_Java_MavenInsta
> >> >>> > >> >> ll/15068/
> >> >>> > >> >> > [11] https://builds.apache.org/job/
> >> >>> beam_PreCommit_Java_MavenInsta
> >> >>> > >> >> ll/15016/
> >> >>> > >> >> >
> >> >>> > >> >> >
> >> >>> > >> >>
> >> >>> > >> >
> >> >>> > >> >
> >> >>> > >>
> >> >>> > >
> >> >>> > >
> >> >>> >
> >> >>>
> >> >>
> >> >>
> >> >
> >>
> >
> >
>

Re: Java pre/post commit test suite breakage

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
On Thu, Oct 26, 2017 at 9:14 AM, Kenneth Knowles <kl...@google.com> wrote:

> On Wed, Oct 25, 2017 at 6:51 PM, Robert Bradshaw <
> robertwb@google.com.invalid> wrote:
>>
>> These are just some ideas; I honestly don't know. I think my upper limit
>> for default precommit feedback is probably 30 minutes, and even there I am
>> not very happy.
>>
>>
>> We should be able to get a good signal in less than 30 minutes. Lint,
>> smoke
>> tests, etc. I think there is a more extensive set of tests that should
>> pass
>> before merging.
>>
>
> Yea, definitely. Digging into this in particular because I think it is
> quite promising as a way to reduce build congestion.
>
> It looks like `mvn clean test -Dfindbugs.skip` would finish in 10 minutes
> if we fixed it. Currently it looks like some modules don't work under this
> invocation, probably requiring some state to be established by prior bits
> run by `verify`. If we iterated on that until review was done, then 30-90
> minutes to prevent breakage would be fine, especially since a failure in
> that longer slower build is probably a signal to add a quicker repro unit
> test.
>

Looked into this, and some IOs explicitly use Guava 19, so presumably that
is important, while the core Java SDK requires Guava 21. This is shaded
away during `mvn verify` or `mvn install` but for `mvn test` there is no
packaging or shading, so when the IO tests run with Guava 19 taking
precedence, checkArguments within the Java SDK fail. This one might be
fixable, but my 5 second takeaway is that our codebase cannot build against
itself without some shading (or equivalent).

Kenn





>
>>
>> Also it might not hurt to audit our tests and see if there are a small
>> number that take a disproportionate amount of time.
>>
>> Any other suggestions?
>>
>> Kenn
>>
>> On Mon, Oct 23, 2017 at 3:29 PM, Kenneth Knowles <kl...@google.com> wrote:
>>
>> > I want to wait and get some green from Jenkins running against the HEAD
>> > groovy scripts to confirm. I haven't sat at my desk long enough to see a
>> > full `mvn -P release clean verify` yet.
>> >
>> > On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <kl...@google.com>
>> wrote:
>> >
>> >> I was easily able to reproduce all the sorts of failures (and some
>> more!)
>> >>
>> >> Here are some things that now work that didn't work, or didn't work
>> >> correctly, before
>> >>
>> >>  - mvn apache-rat:check
>> >>  - mvn -P release apache-rat:check
>> >>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
>> >>
>> >> It turns out we had these issues:
>> >>
>> >>  - items in .gitignore that our RAT config did not ignore
>> >>  - sub-modules actually *did* inherit the RAT config, depending on the
>> >> command you run
>> >>  - paths in the RAT exclude were relative to current dir
>> >>  - paths in our RAT exclude that aren't even part of our codebase or
>> >> generated targets, but just things that Jenkins dropped in the
>> workspace,
>> >> and we were cloning directly into the workspace
>> >>
>> >> I think with the new Jenkins config and the new pom.xml things should
>> >> work well.
>> >>
>> >> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do
>> not
>> >> take effect from being merged but only when the seed job next runs.
>> (TODO:
>> >> fix this). I am still trying to get the relevant Jenkins UI pages to
>> load
>> >> to get your change incorporated into the live scripts, which were
>> regressed
>> >> by being run from an old PR (TODO: fix the fact that this can happen).
>> >>
>> >> Kenn
>> >>
>> >> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
>> >> valentyn@google.com.invalid> wrote:
>> >>
>> >>> Thanks a lot!
>> >>>
>> >>> Kenn, Were you able to reproduce RAT failures and test the fix
>> locally?
>> I
>> >>> think "mvn clean verify -Prelease" still passes for me.
>> >>>
>> >>> Timeout for the test suite has been increased in
>> >>> https://github.com/apache/
>> >>> beam/pull/4028 <https://github.com/apache/beam/pull/4028>.
>> >>>
>> >>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles
>> <klk@google.com.invalid
>> >>> >
>> >>> wrote:
>> >>>
>> >>> > Wrong link - https://github.com/apache/beam/pull/4027
>> >>> >
>> >>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com>
>> >>> wrote:
>> >>> >
>> >>> > > Yea, root cause is the config bug I described. Proposed fix at
>> >>> > > https://github.com/apache/beam/pull/4019/files. I'm working with
>> >>> infra
>> >>> > to
>> >>> > > sort out other build issues that are probably not related.
>> >>> > >
>> >>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik
>> >>> <lc...@google.com.invalid>
>> >>> > > wrote:
>> >>> > >
>> >>> > >> The build breakage I outlined is being tracked in
>> >>> > >> https://issues.apache.org/jira/browse/BEAM-3092
>> >>> > >>
>> >>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com>
>> >>> wrote:
>> >>> > >>
>> >>> > >> > Another breaking change was caused by
>> https://github.com/apache/
>> >>> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because
>> it
>> >>> > changed
>> >>> > >> > the build layout on the Jenkins server and our RAT rules
>> expected
>> >>> to
>> >>> > >> apply
>> >>> > >> > from a root directory. I pinged Kenneth Knowles about it and he
>> >>> said
>> >>> > he
>> >>> > >> was
>> >>> > >> > taking a look.
>> >>> > >> >
>> >>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
>> >>> > >> <rangadi@google.com.invalid
>> >>> > >> > > wrote:
>> >>> > >> >
>> >>> > >> >> Regd (1) :
>> >>> > >> >>
>> >>> > >> >> [4] did have have a file without Apache Licence. It was fixed
>> the
>> >>> > next
>> >>> > >> >> day (
>> >>> > >> >> commit
>> >>> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
>> >>> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
>> >>> > >> >> thanks to Ken Knowles who pinged me about it.
>> >>> > >> >>
>> >>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
>> >>> > >> >> valentyn@google.com>
>> >>> > >> >> wrote:
>> >>> > >> >>
>> >>> > >> >> > Hi Beam-Dev,
>> >>> > >> >> >
>> >>> > >> >> > It's been >5 days since the last successful run of a
>> >>> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days
>> since
>> >>> last
>> >>> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
>> >>> > >> >> >
>> >>> > >> >> > Looking at build logs I see following problems.
>> >>> > >> >> >
>> >>> > >> >> > 1. After October 17, postcommit builds started to fail with
>> >>> > >> >> >
>> >>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plug
>> >>> in:0.12:check
>> >>> > >> >> > (default) on project beam-parent: Too many files with
>> >>> unapproved
>> >>> > >> >> license: 1
>> >>> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
>> >>> > >> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
>> >>> > >> >> > t-2.3.0-SNAPSHOT.rat
>> >>> > >> >> >
>> >>> > >> >> > The earliest build that I see this error is Postcommit #5052
>> >>> [3].
>> >>> > >> >> >
>> >>> > >> >> > This makes me suspect [4] or [5] as a breaking change, since
>> >>> they
>> >>> > >> change
>> >>> > >> >> > pom files.
>> >>> > >> >> >
>> >>> > >> >> > Questions:
>> >>> > >> >> > - Is there a way we can reproduce this failure locally? mvn
>> >>> clean
>> >>> > >> verify
>> >>> > >> >> > passes locally for me.
>> >>> > >> >> > - Is there a way we can see the See RAT report mentioned in
>> the
>> >>> > error
>> >>> > >> >> > log?
>> >>> > >> >> >
>> >>> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer
>> >>> complete
>> >>> > >> within
>> >>> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build
>> >>> makes
>> >>> > >> >> consistent
>> >>> > >> >> > progress, but just does not finish on time. We can also see
>> >>> several
>> >>> > >> >> recent
>> >>> > >> >> > successful builds with execution time very close to time out
>> >>> > [9-11].
>> >>> > >> >> >
>> >>> > >> >> > I'd like to propose to increase time limit for Java
>> precommit
>> >>> test
>> >>> > >> suite
>> >>> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we
>> >>> should
>> >>> > >> >> > definitely try to reduce the test execution time, and reduce
>> >>> > >> flakiness.
>> >>> > >> >> > However we need the tests at least pass for now. If we write
>> >>> off
>> >>> > >> failed
>> >>> > >> >> > test suites as 'flakes' and merge PRs without having a green
>> >>> test
>> >>> > >> >> signal,
>> >>> > >> >> > we will have to spend more time tracing breakages such as
>> #1.
>> >>> > >> >> >
>> >>> > >> >> > Thoughts?
>> >>> > >> >> >
>> >>> > >> >> > Thanks,
>> >>> > >> >> > Valentyn
>> >>> > >> >> >
>> >>> > >> >> > [1] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > ll/
>> >>> > >> >> > [2] https://builds.apache.org/job/
>> >>> beam_PostCommit_Java_MavenInst
>> >>> > all/
>> >>> > >> >> > [3] https://builds.apache.org/job/
>> >>> beam_PostCommit_Java_MavenInst
>> >>> > >> >> > all/5052/changes
>> >>> > >> >> >
>> >>> > >> >> > [4] https://github.com/apache/beam
>> /commit/d745cc9d8cc1735d3b
>> >>> > >> >> > c3c67ba3e2617cb7f11a8c
>> >>> > >> >> > [5] https://github.com/apache/beam
>> >>> /commit/0d8ab6cbbc762dd9f9be1b
>> >>> > >> >> > 3e9a26b6c9d0bb6dc3
>> >>> > >> >> >
>> >>> > >> >> > [6] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15222/
>> >>> > >> >> > [7] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15195/
>> >>> > >> >> > [8] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15220/
>> >>> > >> >> >
>> >>> > >> >> > [9] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15009/
>> >>> > >> >> > [10] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15068/
>> >>> > >> >> > [11] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15016/
>> >>> > >> >> >
>> >>> > >> >> >
>> >>> > >> >>
>> >>> > >> >
>> >>> > >> >
>> >>> > >>
>> >>> > >
>> >>> > >
>> >>> >
>> >>>
>> >>
>> >>
>> >
>>
>
>

Re: Java pre/post commit test suite breakage

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
On Wed, Oct 25, 2017 at 6:51 PM, Robert Bradshaw <
robertwb@google.com.invalid> wrote:
>
> These are just some ideas; I honestly don't know. I think my upper limit
> for default precommit feedback is probably 30 minutes, and even there I am
> not very happy.
>
>
> We should be able to get a good signal in less than 30 minutes. Lint, smoke
> tests, etc. I think there is a more extensive set of tests that should pass
> before merging.
>

Yea, definitely. Digging into this in particular because I think it is
quite promising as a way to reduce build congestion.

It looks like `mvn clean test -Dfindbugs.skip` would finish in 10 minutes
if we fixed it. Currently it looks like some modules don't work under this
invocation, probably requiring some state to be established by prior bits
run by `verify`. If we iterated on that until review was done, then 30-90
minutes to prevent breakage would be fine, especially since a failure in
that longer slower build is probably a signal to add a quicker repro unit
test.

Kenn


>
> Also it might not hurt to audit our tests and see if there are a small
> number that take a disproportionate amount of time.
>
> Any other suggestions?
>
> Kenn
>
> On Mon, Oct 23, 2017 at 3:29 PM, Kenneth Knowles <kl...@google.com> wrote:
>
> > I want to wait and get some green from Jenkins running against the HEAD
> > groovy scripts to confirm. I haven't sat at my desk long enough to see a
> > full `mvn -P release clean verify` yet.
> >
> > On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <kl...@google.com> wrote:
> >
> >> I was easily able to reproduce all the sorts of failures (and some
> more!)
> >>
> >> Here are some things that now work that didn't work, or didn't work
> >> correctly, before
> >>
> >>  - mvn apache-rat:check
> >>  - mvn -P release apache-rat:check
> >>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
> >>
> >> It turns out we had these issues:
> >>
> >>  - items in .gitignore that our RAT config did not ignore
> >>  - sub-modules actually *did* inherit the RAT config, depending on the
> >> command you run
> >>  - paths in the RAT exclude were relative to current dir
> >>  - paths in our RAT exclude that aren't even part of our codebase or
> >> generated targets, but just things that Jenkins dropped in the
> workspace,
> >> and we were cloning directly into the workspace
> >>
> >> I think with the new Jenkins config and the new pom.xml things should
> >> work well.
> >>
> >> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do not
> >> take effect from being merged but only when the seed job next runs.
> (TODO:
> >> fix this). I am still trying to get the relevant Jenkins UI pages to
> load
> >> to get your change incorporated into the live scripts, which were
> regressed
> >> by being run from an old PR (TODO: fix the fact that this can happen).
> >>
> >> Kenn
> >>
> >> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
> >> valentyn@google.com.invalid> wrote:
> >>
> >>> Thanks a lot!
> >>>
> >>> Kenn, Were you able to reproduce RAT failures and test the fix locally?
> I
> >>> think "mvn clean verify -Prelease" still passes for me.
> >>>
> >>> Timeout for the test suite has been increased in
> >>> https://github.com/apache/
> >>> beam/pull/4028 <https://github.com/apache/beam/pull/4028>.
> >>>
> >>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles
> <klk@google.com.invalid
> >>> >
> >>> wrote:
> >>>
> >>> > Wrong link - https://github.com/apache/beam/pull/4027
> >>> >
> >>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com>
> >>> wrote:
> >>> >
> >>> > > Yea, root cause is the config bug I described. Proposed fix at
> >>> > > https://github.com/apache/beam/pull/4019/files. I'm working with
> >>> infra
> >>> > to
> >>> > > sort out other build issues that are probably not related.
> >>> > >
> >>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik
> >>> <lc...@google.com.invalid>
> >>> > > wrote:
> >>> > >
> >>> > >> The build breakage I outlined is being tracked in
> >>> > >> https://issues.apache.org/jira/browse/BEAM-3092
> >>> > >>
> >>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com>
> >>> wrote:
> >>> > >>
> >>> > >> > Another breaking change was caused by
> https://github.com/apache/
> >>> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it
> >>> > changed
> >>> > >> > the build layout on the Jenkins server and our RAT rules
> expected
> >>> to
> >>> > >> apply
> >>> > >> > from a root directory. I pinged Kenneth Knowles about it and he
> >>> said
> >>> > he
> >>> > >> was
> >>> > >> > taking a look.
> >>> > >> >
> >>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
> >>> > >> <rangadi@google.com.invalid
> >>> > >> > > wrote:
> >>> > >> >
> >>> > >> >> Regd (1) :
> >>> > >> >>
> >>> > >> >> [4] did have have a file without Apache Licence. It was fixed
> the
> >>> > next
> >>> > >> >> day (
> >>> > >> >> commit
> >>> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
> >>> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
> >>> > >> >> thanks to Ken Knowles who pinged me about it.
> >>> > >> >>
> >>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
> >>> > >> >> valentyn@google.com>
> >>> > >> >> wrote:
> >>> > >> >>
> >>> > >> >> > Hi Beam-Dev,
> >>> > >> >> >
> >>> > >> >> > It's been >5 days since the last successful run of a
> >>> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since
> >>> last
> >>> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
> >>> > >> >> >
> >>> > >> >> > Looking at build logs I see following problems.
> >>> > >> >> >
> >>> > >> >> > 1. After October 17, postcommit builds started to fail with
> >>> > >> >> >
> >>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plug
> >>> in:0.12:check
> >>> > >> >> > (default) on project beam-parent: Too many files with
> >>> unapproved
> >>> > >> >> license: 1
> >>> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
> >>> > >> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
> >>> > >> >> > t-2.3.0-SNAPSHOT.rat
> >>> > >> >> >
> >>> > >> >> > The earliest build that I see this error is Postcommit #5052
> >>> [3].
> >>> > >> >> >
> >>> > >> >> > This makes me suspect [4] or [5] as a breaking change, since
> >>> they
> >>> > >> change
> >>> > >> >> > pom files.
> >>> > >> >> >
> >>> > >> >> > Questions:
> >>> > >> >> > - Is there a way we can reproduce this failure locally? mvn
> >>> clean
> >>> > >> verify
> >>> > >> >> > passes locally for me.
> >>> > >> >> > - Is there a way we can see the See RAT report mentioned in
> the
> >>> > error
> >>> > >> >> > log?
> >>> > >> >> >
> >>> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer
> >>> complete
> >>> > >> within
> >>> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build
> >>> makes
> >>> > >> >> consistent
> >>> > >> >> > progress, but just does not finish on time. We can also see
> >>> several
> >>> > >> >> recent
> >>> > >> >> > successful builds with execution time very close to time out
> >>> > [9-11].
> >>> > >> >> >
> >>> > >> >> > I'd like to propose to increase time limit for Java precommit
> >>> test
> >>> > >> suite
> >>> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we
> >>> should
> >>> > >> >> > definitely try to reduce the test execution time, and reduce
> >>> > >> flakiness.
> >>> > >> >> > However we need the tests at least pass for now. If we write
> >>> off
> >>> > >> failed
> >>> > >> >> > test suites as 'flakes' and merge PRs without having a green
> >>> test
> >>> > >> >> signal,
> >>> > >> >> > we will have to spend more time tracing breakages such as #1.
> >>> > >> >> >
> >>> > >> >> > Thoughts?
> >>> > >> >> >
> >>> > >> >> > Thanks,
> >>> > >> >> > Valentyn
> >>> > >> >> >
> >>> > >> >> > [1] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > ll/
> >>> > >> >> > [2] https://builds.apache.org/job/
> >>> beam_PostCommit_Java_MavenInst
> >>> > all/
> >>> > >> >> > [3] https://builds.apache.org/job/
> >>> beam_PostCommit_Java_MavenInst
> >>> > >> >> > all/5052/changes
> >>> > >> >> >
> >>> > >> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
> >>> > >> >> > c3c67ba3e2617cb7f11a8c
> >>> > >> >> > [5] https://github.com/apache/beam
> >>> /commit/0d8ab6cbbc762dd9f9be1b
> >>> > >> >> > 3e9a26b6c9d0bb6dc3
> >>> > >> >> >
> >>> > >> >> > [6] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15222/
> >>> > >> >> > [7] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15195/
> >>> > >> >> > [8] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15220/
> >>> > >> >> >
> >>> > >> >> > [9] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15009/
> >>> > >> >> > [10] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15068/
> >>> > >> >> > [11] https://builds.apache.org/job/
> >>> beam_PreCommit_Java_MavenInsta
> >>> > >> >> ll/15016/
> >>> > >> >> >
> >>> > >> >> >
> >>> > >> >>
> >>> > >> >
> >>> > >> >
> >>> > >>
> >>> > >
> >>> > >
> >>> >
> >>>
> >>
> >>
> >
>

Re: Java pre/post commit test suite breakage

Posted by Robert Bradshaw <ro...@google.com.INVALID>.
On Oct 25, 2017 1:04 PM, "Kenneth Knowles" <kl...@google.com.invalid> wrote:

Hi all,

I wanted to circle back on this. We've had continual quasi-outages, where
Jenkins occasionally schedules jobs but not reliably. Meanwhile, many of
our most important jobs continued to hit their timeouts when they did get
scheduled. On top of that, I've run the wrong command once before merging a
PR, resulting in a couple broken precommit runs.

Just so everyone knows the actions that have been taken and what we still
need to do:

 - Java postcommit and precommit timeouts bumped to 240 minutes (!!)
 - Seed job made independent of other jobs, so we can experiment safely
 - RAT config fixed

The latency of precommit is still prohibitive for effective use. Not only
are we waiting too long, but the number of workers needed to avoid backlog
is excessive. Here are some ideas for things we can do:

 - Proceed further on the pipeline job, which is what we want to do long
term


+1

 - Precommit: run only on demand, so workers are not congested by automatic
builds; we could also pick and choose what we want to run


I find automatically running is generally helpful, but at the very least we
could cancel redundant runs (e.g. when more commits are pushed).

 - Consider non-Maven build orchestrators that can do dependency-driven
builds


Huge +1. We don't need to waste 240 minutes on Java tests when nothing but
Python changed, and vice-versa. Likewise many subcomponents could be tested
without retesting their bases and siblings.

 - Precommit: run fewer tests; I think this would mean leaving out some
modules that we are OK finding issues in the postcommit, or making them on
demand only


-1. In my experience postcommit failures are much harder to debug and often
mask other breakage (plus impacting everyone's development). This is a
choice of last resort to me; seems we have a lot of other options that
don't reduce coverage.

 - ValidatesRunner tests: Instead of running a single mavenJob, run
multiple, once which just installs while skipping everything else, then
follow it with just running the tests we care about


+1.

 - Examples integration tests: separate from the maven precommit; also run
as sequenced invocations

These are just some ideas; I honestly don't know. I think my upper limit
for default precommit feedback is probably 30 minutes, and even there I am
not very happy.


We should be able to get a good signal in less than 30 minutes. Lint, smoke
tests, etc. I think there is a more extensive set of tests that should pass
before merging.

Also it might not hurt to audit our tests and see if there are a small
number that take a disproportionate amount of time.

Any other suggestions?

Kenn

On Mon, Oct 23, 2017 at 3:29 PM, Kenneth Knowles <kl...@google.com> wrote:

> I want to wait and get some green from Jenkins running against the HEAD
> groovy scripts to confirm. I haven't sat at my desk long enough to see a
> full `mvn -P release clean verify` yet.
>
> On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <kl...@google.com> wrote:
>
>> I was easily able to reproduce all the sorts of failures (and some more!)
>>
>> Here are some things that now work that didn't work, or didn't work
>> correctly, before
>>
>>  - mvn apache-rat:check
>>  - mvn -P release apache-rat:check
>>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
>>
>> It turns out we had these issues:
>>
>>  - items in .gitignore that our RAT config did not ignore
>>  - sub-modules actually *did* inherit the RAT config, depending on the
>> command you run
>>  - paths in the RAT exclude were relative to current dir
>>  - paths in our RAT exclude that aren't even part of our codebase or
>> generated targets, but just things that Jenkins dropped in the workspace,
>> and we were cloning directly into the workspace
>>
>> I think with the new Jenkins config and the new pom.xml things should
>> work well.
>>
>> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do not
>> take effect from being merged but only when the seed job next runs.
(TODO:
>> fix this). I am still trying to get the relevant Jenkins UI pages to load
>> to get your change incorporated into the live scripts, which were
regressed
>> by being run from an old PR (TODO: fix the fact that this can happen).
>>
>> Kenn
>>
>> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
>> valentyn@google.com.invalid> wrote:
>>
>>> Thanks a lot!
>>>
>>> Kenn, Were you able to reproduce RAT failures and test the fix locally?
I
>>> think "mvn clean verify -Prelease" still passes for me.
>>>
>>> Timeout for the test suite has been increased in
>>> https://github.com/apache/
>>> beam/pull/4028 <https://github.com/apache/beam/pull/4028>.
>>>
>>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <klk@google.com.invalid
>>> >
>>> wrote:
>>>
>>> > Wrong link - https://github.com/apache/beam/pull/4027
>>> >
>>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com>
>>> wrote:
>>> >
>>> > > Yea, root cause is the config bug I described. Proposed fix at
>>> > > https://github.com/apache/beam/pull/4019/files. I'm working with
>>> infra
>>> > to
>>> > > sort out other build issues that are probably not related.
>>> > >
>>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik
>>> <lc...@google.com.invalid>
>>> > > wrote:
>>> > >
>>> > >> The build breakage I outlined is being tracked in
>>> > >> https://issues.apache.org/jira/browse/BEAM-3092
>>> > >>
>>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com>
>>> wrote:
>>> > >>
>>> > >> > Another breaking change was caused by https://github.com/apache/
>>> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it
>>> > changed
>>> > >> > the build layout on the Jenkins server and our RAT rules expected
>>> to
>>> > >> apply
>>> > >> > from a root directory. I pinged Kenneth Knowles about it and he
>>> said
>>> > he
>>> > >> was
>>> > >> > taking a look.
>>> > >> >
>>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
>>> > >> <rangadi@google.com.invalid
>>> > >> > > wrote:
>>> > >> >
>>> > >> >> Regd (1) :
>>> > >> >>
>>> > >> >> [4] did have have a file without Apache Licence. It was fixed
the
>>> > next
>>> > >> >> day (
>>> > >> >> commit
>>> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
>>> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
>>> > >> >> thanks to Ken Knowles who pinged me about it.
>>> > >> >>
>>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
>>> > >> >> valentyn@google.com>
>>> > >> >> wrote:
>>> > >> >>
>>> > >> >> > Hi Beam-Dev,
>>> > >> >> >
>>> > >> >> > It's been >5 days since the last successful run of a
>>> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since
>>> last
>>> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
>>> > >> >> >
>>> > >> >> > Looking at build logs I see following problems.
>>> > >> >> >
>>> > >> >> > 1. After October 17, postcommit builds started to fail with
>>> > >> >> >
>>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plug
>>> in:0.12:check
>>> > >> >> > (default) on project beam-parent: Too many files with
>>> unapproved
>>> > >> >> license: 1
>>> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
>>> > >> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
>>> > >> >> > t-2.3.0-SNAPSHOT.rat
>>> > >> >> >
>>> > >> >> > The earliest build that I see this error is Postcommit #5052
>>> [3].
>>> > >> >> >
>>> > >> >> > This makes me suspect [4] or [5] as a breaking change, since
>>> they
>>> > >> change
>>> > >> >> > pom files.
>>> > >> >> >
>>> > >> >> > Questions:
>>> > >> >> > - Is there a way we can reproduce this failure locally? mvn
>>> clean
>>> > >> verify
>>> > >> >> > passes locally for me.
>>> > >> >> > - Is there a way we can see the See RAT report mentioned in
the
>>> > error
>>> > >> >> > log?
>>> > >> >> >
>>> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer
>>> complete
>>> > >> within
>>> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build
>>> makes
>>> > >> >> consistent
>>> > >> >> > progress, but just does not finish on time. We can also see
>>> several
>>> > >> >> recent
>>> > >> >> > successful builds with execution time very close to time out
>>> > [9-11].
>>> > >> >> >
>>> > >> >> > I'd like to propose to increase time limit for Java precommit
>>> test
>>> > >> suite
>>> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we
>>> should
>>> > >> >> > definitely try to reduce the test execution time, and reduce
>>> > >> flakiness.
>>> > >> >> > However we need the tests at least pass for now. If we write
>>> off
>>> > >> failed
>>> > >> >> > test suites as 'flakes' and merge PRs without having a green
>>> test
>>> > >> >> signal,
>>> > >> >> > we will have to spend more time tracing breakages such as #1.
>>> > >> >> >
>>> > >> >> > Thoughts?
>>> > >> >> >
>>> > >> >> > Thanks,
>>> > >> >> > Valentyn
>>> > >> >> >
>>> > >> >> > [1] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > ll/
>>> > >> >> > [2] https://builds.apache.org/job/
>>> beam_PostCommit_Java_MavenInst
>>> > all/
>>> > >> >> > [3] https://builds.apache.org/job/
>>> beam_PostCommit_Java_MavenInst
>>> > >> >> > all/5052/changes
>>> > >> >> >
>>> > >> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
>>> > >> >> > c3c67ba3e2617cb7f11a8c
>>> > >> >> > [5] https://github.com/apache/beam
>>> /commit/0d8ab6cbbc762dd9f9be1b
>>> > >> >> > 3e9a26b6c9d0bb6dc3
>>> > >> >> >
>>> > >> >> > [6] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15222/
>>> > >> >> > [7] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15195/
>>> > >> >> > [8] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15220/
>>> > >> >> >
>>> > >> >> > [9] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15009/
>>> > >> >> > [10] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15068/
>>> > >> >> > [11] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15016/
>>> > >> >> >
>>> > >> >> >
>>> > >> >>
>>> > >> >
>>> > >> >
>>> > >>
>>> > >
>>> > >
>>> >
>>>
>>
>>
>

Re: Java pre/post commit test suite breakage

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
Hi all,

I wanted to circle back on this. We've had continual quasi-outages, where
Jenkins occasionally schedules jobs but not reliably. Meanwhile, many of
our most important jobs continued to hit their timeouts when they did get
scheduled. On top of that, I've run the wrong command once before merging a
PR, resulting in a couple broken precommit runs.

Just so everyone knows the actions that have been taken and what we still
need to do:

 - Java postcommit and precommit timeouts bumped to 240 minutes (!!)
 - Seed job made independent of other jobs, so we can experiment safely
 - RAT config fixed

The latency of precommit is still prohibitive for effective use. Not only
are we waiting too long, but the number of workers needed to avoid backlog
is excessive. Here are some ideas for things we can do:

 - Proceed further on the pipeline job, which is what we want to do long
term
 - Precommit: run only on demand, so workers are not congested by automatic
builds; we could also pick and choose what we want to run
 - Consider non-Maven build orchestrators that can do dependency-driven
builds
 - Precommit: run fewer tests; I think this would mean leaving out some
modules that we are OK finding issues in the postcommit, or making them on
demand only
 - ValidatesRunner tests: Instead of running a single mavenJob, run
multiple, once which just installs while skipping everything else, then
follow it with just running the tests we care about
 - Examples integration tests: separate from the maven precommit; also run
as sequenced invocations

These are just some ideas; I honestly don't know. I think my upper limit
for default precommit feedback is probably 30 minutes, and even there I am
not very happy.

Any other suggestions?

Kenn

On Mon, Oct 23, 2017 at 3:29 PM, Kenneth Knowles <kl...@google.com> wrote:

> I want to wait and get some green from Jenkins running against the HEAD
> groovy scripts to confirm. I haven't sat at my desk long enough to see a
> full `mvn -P release clean verify` yet.
>
> On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <kl...@google.com> wrote:
>
>> I was easily able to reproduce all the sorts of failures (and some more!)
>>
>> Here are some things that now work that didn't work, or didn't work
>> correctly, before
>>
>>  - mvn apache-rat:check
>>  - mvn -P release apache-rat:check
>>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
>>
>> It turns out we had these issues:
>>
>>  - items in .gitignore that our RAT config did not ignore
>>  - sub-modules actually *did* inherit the RAT config, depending on the
>> command you run
>>  - paths in the RAT exclude were relative to current dir
>>  - paths in our RAT exclude that aren't even part of our codebase or
>> generated targets, but just things that Jenkins dropped in the workspace,
>> and we were cloning directly into the workspace
>>
>> I think with the new Jenkins config and the new pom.xml things should
>> work well.
>>
>> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do not
>> take effect from being merged but only when the seed job next runs. (TODO:
>> fix this). I am still trying to get the relevant Jenkins UI pages to load
>> to get your change incorporated into the live scripts, which were regressed
>> by being run from an old PR (TODO: fix the fact that this can happen).
>>
>> Kenn
>>
>> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
>> valentyn@google.com.invalid> wrote:
>>
>>> Thanks a lot!
>>>
>>> Kenn, Were you able to reproduce RAT failures and test the fix locally? I
>>> think "mvn clean verify -Prelease" still passes for me.
>>>
>>> Timeout for the test suite has been increased in
>>> https://github.com/apache/
>>> beam/pull/4028 <https://github.com/apache/beam/pull/4028>.
>>>
>>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <klk@google.com.invalid
>>> >
>>> wrote:
>>>
>>> > Wrong link - https://github.com/apache/beam/pull/4027
>>> >
>>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com>
>>> wrote:
>>> >
>>> > > Yea, root cause is the config bug I described. Proposed fix at
>>> > > https://github.com/apache/beam/pull/4019/files. I'm working with
>>> infra
>>> > to
>>> > > sort out other build issues that are probably not related.
>>> > >
>>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik
>>> <lc...@google.com.invalid>
>>> > > wrote:
>>> > >
>>> > >> The build breakage I outlined is being tracked in
>>> > >> https://issues.apache.org/jira/browse/BEAM-3092
>>> > >>
>>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com>
>>> wrote:
>>> > >>
>>> > >> > Another breaking change was caused by https://github.com/apache/
>>> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it
>>> > changed
>>> > >> > the build layout on the Jenkins server and our RAT rules expected
>>> to
>>> > >> apply
>>> > >> > from a root directory. I pinged Kenneth Knowles about it and he
>>> said
>>> > he
>>> > >> was
>>> > >> > taking a look.
>>> > >> >
>>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
>>> > >> <rangadi@google.com.invalid
>>> > >> > > wrote:
>>> > >> >
>>> > >> >> Regd (1) :
>>> > >> >>
>>> > >> >> [4] did have have a file without Apache Licence. It was fixed the
>>> > next
>>> > >> >> day (
>>> > >> >> commit
>>> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
>>> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
>>> > >> >> thanks to Ken Knowles who pinged me about it.
>>> > >> >>
>>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
>>> > >> >> valentyn@google.com>
>>> > >> >> wrote:
>>> > >> >>
>>> > >> >> > Hi Beam-Dev,
>>> > >> >> >
>>> > >> >> > It's been >5 days since the last successful run of a
>>> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since
>>> last
>>> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
>>> > >> >> >
>>> > >> >> > Looking at build logs I see following problems.
>>> > >> >> >
>>> > >> >> > 1. After October 17, postcommit builds started to fail with
>>> > >> >> >
>>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plug
>>> in:0.12:check
>>> > >> >> > (default) on project beam-parent: Too many files with
>>> unapproved
>>> > >> >> license: 1
>>> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
>>> > >> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
>>> > >> >> > t-2.3.0-SNAPSHOT.rat
>>> > >> >> >
>>> > >> >> > The earliest build that I see this error is Postcommit #5052
>>> [3].
>>> > >> >> >
>>> > >> >> > This makes me suspect [4] or [5] as a breaking change, since
>>> they
>>> > >> change
>>> > >> >> > pom files.
>>> > >> >> >
>>> > >> >> > Questions:
>>> > >> >> > - Is there a way we can reproduce this failure locally? mvn
>>> clean
>>> > >> verify
>>> > >> >> > passes locally for me.
>>> > >> >> > - Is there a way we can see the See RAT report mentioned in the
>>> > error
>>> > >> >> > log?
>>> > >> >> >
>>> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer
>>> complete
>>> > >> within
>>> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build
>>> makes
>>> > >> >> consistent
>>> > >> >> > progress, but just does not finish on time. We can also see
>>> several
>>> > >> >> recent
>>> > >> >> > successful builds with execution time very close to time out
>>> > [9-11].
>>> > >> >> >
>>> > >> >> > I'd like to propose to increase time limit for Java precommit
>>> test
>>> > >> suite
>>> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we
>>> should
>>> > >> >> > definitely try to reduce the test execution time, and reduce
>>> > >> flakiness.
>>> > >> >> > However we need the tests at least pass for now. If we write
>>> off
>>> > >> failed
>>> > >> >> > test suites as 'flakes' and merge PRs without having a green
>>> test
>>> > >> >> signal,
>>> > >> >> > we will have to spend more time tracing breakages such as #1.
>>> > >> >> >
>>> > >> >> > Thoughts?
>>> > >> >> >
>>> > >> >> > Thanks,
>>> > >> >> > Valentyn
>>> > >> >> >
>>> > >> >> > [1] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > ll/
>>> > >> >> > [2] https://builds.apache.org/job/
>>> beam_PostCommit_Java_MavenInst
>>> > all/
>>> > >> >> > [3] https://builds.apache.org/job/
>>> beam_PostCommit_Java_MavenInst
>>> > >> >> > all/5052/changes
>>> > >> >> >
>>> > >> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
>>> > >> >> > c3c67ba3e2617cb7f11a8c
>>> > >> >> > [5] https://github.com/apache/beam
>>> /commit/0d8ab6cbbc762dd9f9be1b
>>> > >> >> > 3e9a26b6c9d0bb6dc3
>>> > >> >> >
>>> > >> >> > [6] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15222/
>>> > >> >> > [7] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15195/
>>> > >> >> > [8] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15220/
>>> > >> >> >
>>> > >> >> > [9] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15009/
>>> > >> >> > [10] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15068/
>>> > >> >> > [11] https://builds.apache.org/job/
>>> beam_PreCommit_Java_MavenInsta
>>> > >> >> ll/15016/
>>> > >> >> >
>>> > >> >> >
>>> > >> >>
>>> > >> >
>>> > >> >
>>> > >>
>>> > >
>>> > >
>>> >
>>>
>>
>>
>

Re: Java pre/post commit test suite breakage

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
I want to wait and get some green from Jenkins running against the HEAD
groovy scripts to confirm. I haven't sat at my desk long enough to see a
full `mvn -P release clean verify` yet.

On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <kl...@google.com> wrote:

> I was easily able to reproduce all the sorts of failures (and some more!)
>
> Here are some things that now work that didn't work, or didn't work
> correctly, before
>
>  - mvn apache-rat:check
>  - mvn -P release apache-rat:check
>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
>
> It turns out we had these issues:
>
>  - items in .gitignore that our RAT config did not ignore
>  - sub-modules actually *did* inherit the RAT config, depending on the
> command you run
>  - paths in the RAT exclude were relative to current dir
>  - paths in our RAT exclude that aren't even part of our codebase or
> generated targets, but just things that Jenkins dropped in the workspace,
> and we were cloning directly into the workspace
>
> I think with the new Jenkins config and the new pom.xml things should work
> well.
>
> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do not
> take effect from being merged but only when the seed job next runs. (TODO:
> fix this). I am still trying to get the relevant Jenkins UI pages to load
> to get your change incorporated into the live scripts, which were regressed
> by being run from an old PR (TODO: fix the fact that this can happen).
>
> Kenn
>
> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
> valentyn@google.com.invalid> wrote:
>
>> Thanks a lot!
>>
>> Kenn, Were you able to reproduce RAT failures and test the fix locally? I
>> think "mvn clean verify -Prelease" still passes for me.
>>
>> Timeout for the test suite has been increased in
>> https://github.com/apache/
>> beam/pull/4028 <https://github.com/apache/beam/pull/4028>.
>>
>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com.invalid>
>> wrote:
>>
>> > Wrong link - https://github.com/apache/beam/pull/4027
>> >
>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com>
>> wrote:
>> >
>> > > Yea, root cause is the config bug I described. Proposed fix at
>> > > https://github.com/apache/beam/pull/4019/files. I'm working with
>> infra
>> > to
>> > > sort out other build issues that are probably not related.
>> > >
>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik <lcwik@google.com.invalid
>> >
>> > > wrote:
>> > >
>> > >> The build breakage I outlined is being tracked in
>> > >> https://issues.apache.org/jira/browse/BEAM-3092
>> > >>
>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com>
>> wrote:
>> > >>
>> > >> > Another breaking change was caused by https://github.com/apache/
>> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it
>> > changed
>> > >> > the build layout on the Jenkins server and our RAT rules expected
>> to
>> > >> apply
>> > >> > from a root directory. I pinged Kenneth Knowles about it and he
>> said
>> > he
>> > >> was
>> > >> > taking a look.
>> > >> >
>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
>> > >> <rangadi@google.com.invalid
>> > >> > > wrote:
>> > >> >
>> > >> >> Regd (1) :
>> > >> >>
>> > >> >> [4] did have have a file without Apache Licence. It was fixed the
>> > next
>> > >> >> day (
>> > >> >> commit
>> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
>> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
>> > >> >> thanks to Ken Knowles who pinged me about it.
>> > >> >>
>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
>> > >> >> valentyn@google.com>
>> > >> >> wrote:
>> > >> >>
>> > >> >> > Hi Beam-Dev,
>> > >> >> >
>> > >> >> > It's been >5 days since the last successful run of a
>> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since
>> last
>> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
>> > >> >> >
>> > >> >> > Looking at build logs I see following problems.
>> > >> >> >
>> > >> >> > 1. After October 17, postcommit builds started to fail with
>> > >> >> >
>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plug
>> in:0.12:check
>> > >> >> > (default) on project beam-parent: Too many files with unapproved
>> > >> >> license: 1
>> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
>> > >> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
>> > >> >> > t-2.3.0-SNAPSHOT.rat
>> > >> >> >
>> > >> >> > The earliest build that I see this error is Postcommit #5052
>> [3].
>> > >> >> >
>> > >> >> > This makes me suspect [4] or [5] as a breaking change, since
>> they
>> > >> change
>> > >> >> > pom files.
>> > >> >> >
>> > >> >> > Questions:
>> > >> >> > - Is there a way we can reproduce this failure locally? mvn
>> clean
>> > >> verify
>> > >> >> > passes locally for me.
>> > >> >> > - Is there a way we can see the See RAT report mentioned in the
>> > error
>> > >> >> > log?
>> > >> >> >
>> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer complete
>> > >> within
>> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build makes
>> > >> >> consistent
>> > >> >> > progress, but just does not finish on time. We can also see
>> several
>> > >> >> recent
>> > >> >> > successful builds with execution time very close to time out
>> > [9-11].
>> > >> >> >
>> > >> >> > I'd like to propose to increase time limit for Java precommit
>> test
>> > >> suite
>> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we
>> should
>> > >> >> > definitely try to reduce the test execution time, and reduce
>> > >> flakiness.
>> > >> >> > However we need the tests at least pass for now. If we write off
>> > >> failed
>> > >> >> > test suites as 'flakes' and merge PRs without having a green
>> test
>> > >> >> signal,
>> > >> >> > we will have to spend more time tracing breakages such as #1.
>> > >> >> >
>> > >> >> > Thoughts?
>> > >> >> >
>> > >> >> > Thanks,
>> > >> >> > Valentyn
>> > >> >> >
>> > >> >> > [1] https://builds.apache.org/job/
>> beam_PreCommit_Java_MavenInsta
>> > ll/
>> > >> >> > [2] https://builds.apache.org/job/
>> beam_PostCommit_Java_MavenInst
>> > all/
>> > >> >> > [3] https://builds.apache.org/job/
>> beam_PostCommit_Java_MavenInst
>> > >> >> > all/5052/changes
>> > >> >> >
>> > >> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
>> > >> >> > c3c67ba3e2617cb7f11a8c
>> > >> >> > [5] https://github.com/apache/beam
>> /commit/0d8ab6cbbc762dd9f9be1b
>> > >> >> > 3e9a26b6c9d0bb6dc3
>> > >> >> >
>> > >> >> > [6] https://builds.apache.org/job/
>> beam_PreCommit_Java_MavenInsta
>> > >> >> ll/15222/
>> > >> >> > [7] https://builds.apache.org/job/
>> beam_PreCommit_Java_MavenInsta
>> > >> >> ll/15195/
>> > >> >> > [8] https://builds.apache.org/job/
>> beam_PreCommit_Java_MavenInsta
>> > >> >> ll/15220/
>> > >> >> >
>> > >> >> > [9] https://builds.apache.org/job/
>> beam_PreCommit_Java_MavenInsta
>> > >> >> ll/15009/
>> > >> >> > [10] https://builds.apache.org/job/
>> beam_PreCommit_Java_MavenInsta
>> > >> >> ll/15068/
>> > >> >> > [11] https://builds.apache.org/job/
>> beam_PreCommit_Java_MavenInsta
>> > >> >> ll/15016/
>> > >> >> >
>> > >> >> >
>> > >> >>
>> > >> >
>> > >> >
>> > >>
>> > >
>> > >
>> >
>>
>
>

Re: Java pre/post commit test suite breakage

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
I was easily able to reproduce all the sorts of failures (and some more!)

Here are some things that now work that didn't work, or didn't work
correctly, before

 - mvn apache-rat:check
 - mvn -P release apache-rat:check
 - mvn -P release -f somewhere/else/pom.xml apache-rat:check

It turns out we had these issues:

 - items in .gitignore that our RAT config did not ignore
 - sub-modules actually *did* inherit the RAT config, depending on the
command you run
 - paths in the RAT exclude were relative to current dir
 - paths in our RAT exclude that aren't even part of our codebase or
generated targets, but just things that Jenkins dropped in the workspace,
and we were cloning directly into the workspace

I think with the new Jenkins config and the new pom.xml things should work
well.

Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do not
take effect from being merged but only when the seed job next runs. (TODO:
fix this). I am still trying to get the relevant Jenkins UI pages to load
to get your change incorporated into the live scripts, which were regressed
by being run from an old PR (TODO: fix the fact that this can happen).

Kenn

On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
valentyn@google.com.invalid> wrote:

> Thanks a lot!
>
> Kenn, Were you able to reproduce RAT failures and test the fix locally? I
> think "mvn clean verify -Prelease" still passes for me.
>
> Timeout for the test suite has been increased in
> https://github.com/apache/
> beam/pull/4028.
>
> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com.invalid>
> wrote:
>
> > Wrong link - https://github.com/apache/beam/pull/4027
> >
> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com> wrote:
> >
> > > Yea, root cause is the config bug I described. Proposed fix at
> > > https://github.com/apache/beam/pull/4019/files. I'm working with infra
> > to
> > > sort out other build issues that are probably not related.
> > >
> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik <lcwik@google.com.invalid
> >
> > > wrote:
> > >
> > >> The build breakage I outlined is being tracked in
> > >> https://issues.apache.org/jira/browse/BEAM-3092
> > >>
> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com>
> wrote:
> > >>
> > >> > Another breaking change was caused by https://github.com/apache/
> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it
> > changed
> > >> > the build layout on the Jenkins server and our RAT rules expected to
> > >> apply
> > >> > from a root directory. I pinged Kenneth Knowles about it and he said
> > he
> > >> was
> > >> > taking a look.
> > >> >
> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
> > >> <rangadi@google.com.invalid
> > >> > > wrote:
> > >> >
> > >> >> Regd (1) :
> > >> >>
> > >> >> [4] did have have a file without Apache Licence. It was fixed the
> > next
> > >> >> day (
> > >> >> commit
> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
> > >> >> thanks to Ken Knowles who pinged me about it.
> > >> >>
> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
> > >> >> valentyn@google.com>
> > >> >> wrote:
> > >> >>
> > >> >> > Hi Beam-Dev,
> > >> >> >
> > >> >> > It's been >5 days since the last successful run of a
> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since
> last
> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
> > >> >> >
> > >> >> > Looking at build logs I see following problems.
> > >> >> >
> > >> >> > 1. After October 17, postcommit builds started to fail with
> > >> >> >
> > >> >> > Failed to execute goal org.apache.rat:apache-rat-
> plugin:0.12:check
> > >> >> > (default) on project beam-parent: Too many files with unapproved
> > >> >> license: 1
> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
> > >> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
> > >> >> > t-2.3.0-SNAPSHOT.rat
> > >> >> >
> > >> >> > The earliest build that I see this error is Postcommit #5052 [3].
> > >> >> >
> > >> >> > This makes me suspect [4] or [5] as a breaking change, since they
> > >> change
> > >> >> > pom files.
> > >> >> >
> > >> >> > Questions:
> > >> >> > - Is there a way we can reproduce this failure locally? mvn clean
> > >> verify
> > >> >> > passes locally for me.
> > >> >> > - Is there a way we can see the See RAT report mentioned in the
> > error
> > >> >> > log?
> > >> >> >
> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer complete
> > >> within
> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build makes
> > >> >> consistent
> > >> >> > progress, but just does not finish on time. We can also see
> several
> > >> >> recent
> > >> >> > successful builds with execution time very close to time out
> > [9-11].
> > >> >> >
> > >> >> > I'd like to propose to increase time limit for Java precommit
> test
> > >> suite
> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we should
> > >> >> > definitely try to reduce the test execution time, and reduce
> > >> flakiness.
> > >> >> > However we need the tests at least pass for now. If we write off
> > >> failed
> > >> >> > test suites as 'flakes' and merge PRs without having a green test
> > >> >> signal,
> > >> >> > we will have to spend more time tracing breakages such as #1.
> > >> >> >
> > >> >> > Thoughts?
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Valentyn
> > >> >> >
> > >> >> > [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> > ll/
> > >> >> > [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
> > all/
> > >> >> > [3] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
> > >> >> > all/5052/changes
> > >> >> >
> > >> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
> > >> >> > c3c67ba3e2617cb7f11a8c
> > >> >> > [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b
> > >> >> > 3e9a26b6c9d0bb6dc3
> > >> >> >
> > >> >> > [6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> > >> >> ll/15222/
> > >> >> > [7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> > >> >> ll/15195/
> > >> >> > [8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> > >> >> ll/15220/
> > >> >> >
> > >> >> > [9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> > >> >> ll/15009/
> > >> >> > [10] https://builds.apache.org/job/
> beam_PreCommit_Java_MavenInsta
> > >> >> ll/15068/
> > >> >> > [11] https://builds.apache.org/job/
> beam_PreCommit_Java_MavenInsta
> > >> >> ll/15016/
> > >> >> >
> > >> >> >
> > >> >>
> > >> >
> > >> >
> > >>
> > >
> > >
> >
>

Re: Java pre/post commit test suite breakage

Posted by Valentyn Tymofieiev <va...@google.com.INVALID>.
Thanks a lot!

Kenn, Were you able to reproduce RAT failures and test the fix locally? I
think "mvn clean verify -Prelease" still passes for me.

Timeout for the test suite has been increased in https://github.com/apache/
beam/pull/4028.

On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com.invalid>
wrote:

> Wrong link - https://github.com/apache/beam/pull/4027
>
> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com> wrote:
>
> > Yea, root cause is the config bug I described. Proposed fix at
> > https://github.com/apache/beam/pull/4019/files. I'm working with infra
> to
> > sort out other build issues that are probably not related.
> >
> > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik <lc...@google.com.invalid>
> > wrote:
> >
> >> The build breakage I outlined is being tracked in
> >> https://issues.apache.org/jira/browse/BEAM-3092
> >>
> >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com> wrote:
> >>
> >> > Another breaking change was caused by https://github.com/apache/
> >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it
> changed
> >> > the build layout on the Jenkins server and our RAT rules expected to
> >> apply
> >> > from a root directory. I pinged Kenneth Knowles about it and he said
> he
> >> was
> >> > taking a look.
> >> >
> >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
> >> <rangadi@google.com.invalid
> >> > > wrote:
> >> >
> >> >> Regd (1) :
> >> >>
> >> >> [4] did have have a file without Apache Licence. It was fixed the
> next
> >> >> day (
> >> >> commit
> >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
> >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
> >> >> thanks to Ken Knowles who pinged me about it.
> >> >>
> >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
> >> >> valentyn@google.com>
> >> >> wrote:
> >> >>
> >> >> > Hi Beam-Dev,
> >> >> >
> >> >> > It's been >5 days since the last successful run of a
> >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since last
> >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
> >> >> >
> >> >> > Looking at build logs I see following problems.
> >> >> >
> >> >> > 1. After October 17, postcommit builds started to fail with
> >> >> >
> >> >> > Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
> >> >> > (default) on project beam-parent: Too many files with unapproved
> >> >> license: 1
> >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
> >> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
> >> >> > t-2.3.0-SNAPSHOT.rat
> >> >> >
> >> >> > The earliest build that I see this error is Postcommit #5052 [3].
> >> >> >
> >> >> > This makes me suspect [4] or [5] as a breaking change, since they
> >> change
> >> >> > pom files.
> >> >> >
> >> >> > Questions:
> >> >> > - Is there a way we can reproduce this failure locally? mvn clean
> >> verify
> >> >> > passes locally for me.
> >> >> > - Is there a way we can see the See RAT report mentioned in the
> error
> >> >> > log?
> >> >> >
> >> >> > 2. Prior to onset of #1 Java Precommit builds no longer complete
> >> within
> >> >> > allotted 150 min time. Looking at [6-8] it seems the build makes
> >> >> consistent
> >> >> > progress, but just does not finish on time. We can also see several
> >> >> recent
> >> >> > successful builds with execution time very close to time out
> [9-11].
> >> >> >
> >> >> > I'd like to propose to increase time limit for Java precommit test
> >> suite
> >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we should
> >> >> > definitely try to reduce the test execution time, and reduce
> >> flakiness.
> >> >> > However we need the tests at least pass for now. If we write off
> >> failed
> >> >> > test suites as 'flakes' and merge PRs without having a green test
> >> >> signal,
> >> >> > we will have to spend more time tracing breakages such as #1.
> >> >> >
> >> >> > Thoughts?
> >> >> >
> >> >> > Thanks,
> >> >> > Valentyn
> >> >> >
> >> >> > [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> ll/
> >> >> > [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
> all/
> >> >> > [3] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
> >> >> > all/5052/changes
> >> >> >
> >> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
> >> >> > c3c67ba3e2617cb7f11a8c
> >> >> > [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b
> >> >> > 3e9a26b6c9d0bb6dc3
> >> >> >
> >> >> > [6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> >> ll/15222/
> >> >> > [7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> >> ll/15195/
> >> >> > [8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> >> ll/15220/
> >> >> >
> >> >> > [9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> >> ll/15009/
> >> >> > [10] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> >> ll/15068/
> >> >> > [11] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> >> ll/15016/
> >> >> >
> >> >> >
> >> >>
> >> >
> >> >
> >>
> >
> >
>

Re: Java pre/post commit test suite breakage

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
Wrong link - https://github.com/apache/beam/pull/4027

On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <kl...@google.com> wrote:

> Yea, root cause is the config bug I described. Proposed fix at
> https://github.com/apache/beam/pull/4019/files. I'm working with infra to
> sort out other build issues that are probably not related.
>
> On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik <lc...@google.com.invalid>
> wrote:
>
>> The build breakage I outlined is being tracked in
>> https://issues.apache.org/jira/browse/BEAM-3092
>>
>> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com> wrote:
>>
>> > Another breaking change was caused by https://github.com/apache/
>> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it changed
>> > the build layout on the Jenkins server and our RAT rules expected to
>> apply
>> > from a root directory. I pinged Kenneth Knowles about it and he said he
>> was
>> > taking a look.
>> >
>> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
>> <rangadi@google.com.invalid
>> > > wrote:
>> >
>> >> Regd (1) :
>> >>
>> >> [4] did have have a file without Apache Licence. It was fixed the next
>> >> day (
>> >> commit
>> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
>> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
>> >> thanks to Ken Knowles who pinged me about it.
>> >>
>> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
>> >> valentyn@google.com>
>> >> wrote:
>> >>
>> >> > Hi Beam-Dev,
>> >> >
>> >> > It's been >5 days since the last successful run of a
>> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since last
>> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
>> >> >
>> >> > Looking at build logs I see following problems.
>> >> >
>> >> > 1. After October 17, postcommit builds started to fail with
>> >> >
>> >> > Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
>> >> > (default) on project beam-parent: Too many files with unapproved
>> >> license: 1
>> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
>> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
>> >> > t-2.3.0-SNAPSHOT.rat
>> >> >
>> >> > The earliest build that I see this error is Postcommit #5052 [3].
>> >> >
>> >> > This makes me suspect [4] or [5] as a breaking change, since they
>> change
>> >> > pom files.
>> >> >
>> >> > Questions:
>> >> > - Is there a way we can reproduce this failure locally? mvn clean
>> verify
>> >> > passes locally for me.
>> >> > - Is there a way we can see the See RAT report mentioned in the error
>> >> > log?
>> >> >
>> >> > 2. Prior to onset of #1 Java Precommit builds no longer complete
>> within
>> >> > allotted 150 min time. Looking at [6-8] it seems the build makes
>> >> consistent
>> >> > progress, but just does not finish on time. We can also see several
>> >> recent
>> >> > successful builds with execution time very close to time out [9-11].
>> >> >
>> >> > I'd like to propose to increase time limit for Java precommit test
>> suite
>> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we should
>> >> > definitely try to reduce the test execution time, and reduce
>> flakiness.
>> >> > However we need the tests at least pass for now. If we write off
>> failed
>> >> > test suites as 'flakes' and merge PRs without having a green test
>> >> signal,
>> >> > we will have to spend more time tracing breakages such as #1.
>> >> >
>> >> > Thoughts?
>> >> >
>> >> > Thanks,
>> >> > Valentyn
>> >> >
>> >> > [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/
>> >> > [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/
>> >> > [3] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
>> >> > all/5052/changes
>> >> >
>> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
>> >> > c3c67ba3e2617cb7f11a8c
>> >> > [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b
>> >> > 3e9a26b6c9d0bb6dc3
>> >> >
>> >> > [6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> >> ll/15222/
>> >> > [7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> >> ll/15195/
>> >> > [8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> >> ll/15220/
>> >> >
>> >> > [9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> >> ll/15009/
>> >> > [10] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> >> ll/15068/
>> >> > [11] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> >> ll/15016/
>> >> >
>> >> >
>> >>
>> >
>> >
>>
>
>

Re: Java pre/post commit test suite breakage

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
Yea, root cause is the config bug I described. Proposed fix at
https://github.com/apache/beam/pull/4019/files. I'm working with infra to
sort out other build issues that are probably not related.

On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik <lc...@google.com.invalid>
wrote:

> The build breakage I outlined is being tracked in
> https://issues.apache.org/jira/browse/BEAM-3092
>
> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com> wrote:
>
> > Another breaking change was caused by https://github.com/apache/
> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it changed
> > the build layout on the Jenkins server and our RAT rules expected to
> apply
> > from a root directory. I pinged Kenneth Knowles about it and he said he
> was
> > taking a look.
> >
> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
> <rangadi@google.com.invalid
> > > wrote:
> >
> >> Regd (1) :
> >>
> >> [4] did have have a file without Apache Licence. It was fixed the next
> >> day (
> >> commit
> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
> >> thanks to Ken Knowles who pinged me about it.
> >>
> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
> >> valentyn@google.com>
> >> wrote:
> >>
> >> > Hi Beam-Dev,
> >> >
> >> > It's been >5 days since the last successful run of a
> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since last
> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
> >> >
> >> > Looking at build logs I see following problems.
> >> >
> >> > 1. After October 17, postcommit builds started to fail with
> >> >
> >> > Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
> >> > (default) on project beam-parent: Too many files with unapproved
> >> license: 1
> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
> >> > t-2.3.0-SNAPSHOT.rat
> >> >
> >> > The earliest build that I see this error is Postcommit #5052 [3].
> >> >
> >> > This makes me suspect [4] or [5] as a breaking change, since they
> change
> >> > pom files.
> >> >
> >> > Questions:
> >> > - Is there a way we can reproduce this failure locally? mvn clean
> verify
> >> > passes locally for me.
> >> > - Is there a way we can see the See RAT report mentioned in the error
> >> > log?
> >> >
> >> > 2. Prior to onset of #1 Java Precommit builds no longer complete
> within
> >> > allotted 150 min time. Looking at [6-8] it seems the build makes
> >> consistent
> >> > progress, but just does not finish on time. We can also see several
> >> recent
> >> > successful builds with execution time very close to time out [9-11].
> >> >
> >> > I'd like to propose to increase time limit for Java precommit test
> suite
> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we should
> >> > definitely try to reduce the test execution time, and reduce
> flakiness.
> >> > However we need the tests at least pass for now. If we write off
> failed
> >> > test suites as 'flakes' and merge PRs without having a green test
> >> signal,
> >> > we will have to spend more time tracing breakages such as #1.
> >> >
> >> > Thoughts?
> >> >
> >> > Thanks,
> >> > Valentyn
> >> >
> >> > [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/
> >> > [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/
> >> > [3] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
> >> > all/5052/changes
> >> >
> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
> >> > c3c67ba3e2617cb7f11a8c
> >> > [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b
> >> > 3e9a26b6c9d0bb6dc3
> >> >
> >> > [6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> ll/15222/
> >> > [7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> ll/15195/
> >> > [8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> ll/15220/
> >> >
> >> > [9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> ll/15009/
> >> > [10] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> ll/15068/
> >> > [11] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
> >> ll/15016/
> >> >
> >> >
> >>
> >
> >
>

Re: Java pre/post commit test suite breakage

Posted by Lukasz Cwik <lc...@google.com.INVALID>.
The build breakage I outlined is being tracked in
https://issues.apache.org/jira/browse/BEAM-3092

On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com> wrote:

> Another breaking change was caused by https://github.com/apache/
> beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because it changed
> the build layout on the Jenkins server and our RAT rules expected to apply
> from a root directory. I pinged Kenneth Knowles about it and he said he was
> taking a look.
>
> On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi <rangadi@google.com.invalid
> > wrote:
>
>> Regd (1) :
>>
>> [4] did have have a file without Apache Licence. It was fixed the next
>> day (
>> commit
>> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
>> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
>> thanks to Ken Knowles who pinged me about it.
>>
>> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
>> valentyn@google.com>
>> wrote:
>>
>> > Hi Beam-Dev,
>> >
>> > It's been >5 days since the last successful run of a
>> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since last
>> > successful run of beam_PreCommit_Java_MavenInstall[2].
>> >
>> > Looking at build logs I see following problems.
>> >
>> > 1. After October 17, postcommit builds started to fail with
>> >
>> > Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
>> > (default) on project beam-parent: Too many files with unapproved
>> license: 1
>> > See RAT report in: /home/jenkins/jenkins-slave/wo
>> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
>> > t-2.3.0-SNAPSHOT.rat
>> >
>> > The earliest build that I see this error is Postcommit #5052 [3].
>> >
>> > This makes me suspect [4] or [5] as a breaking change, since they change
>> > pom files.
>> >
>> > Questions:
>> > - Is there a way we can reproduce this failure locally? mvn clean verify
>> > passes locally for me.
>> > - Is there a way we can see the See RAT report mentioned in the error
>> > log?
>> >
>> > 2. Prior to onset of #1 Java Precommit builds no longer complete within
>> > allotted 150 min time. Looking at [6-8] it seems the build makes
>> consistent
>> > progress, but just does not finish on time. We can also see several
>> recent
>> > successful builds with execution time very close to time out [9-11].
>> >
>> > I'd like to propose to increase time limit for Java precommit test suite
>> > from 2.5 to 4 hours. 4 hours is long time. I agree that we should
>> > definitely try to reduce the test execution time, and reduce flakiness.
>> > However we need the tests at least pass for now. If we write off failed
>> > test suites as 'flakes' and merge PRs without having a green test
>> signal,
>> > we will have to spend more time tracing breakages such as #1.
>> >
>> > Thoughts?
>> >
>> > Thanks,
>> > Valentyn
>> >
>> > [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/
>> > [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/
>> > [3] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
>> > all/5052/changes
>> >
>> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
>> > c3c67ba3e2617cb7f11a8c
>> > [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b
>> > 3e9a26b6c9d0bb6dc3
>> >
>> > [6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> ll/15222/
>> > [7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> ll/15195/
>> > [8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> ll/15220/
>> >
>> > [9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> ll/15009/
>> > [10] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> ll/15068/
>> > [11] https://builds.apache.org/job/beam_PreCommit_Java_MavenInsta
>> ll/15016/
>> >
>> >
>>
>
>

Re: Java pre/post commit test suite breakage

Posted by Lukasz Cwik <lc...@google.com.INVALID>.
Another breaking change was caused by
https://github.com/apache/beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98
because it changed the build layout on the Jenkins server and our RAT rules
expected to apply from a root directory. I pinged Kenneth Knowles about it
and he said he was taking a look.

On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi <ra...@google.com.invalid>
wrote:

> Regd (1) :
>
> [4] did have have a file without Apache Licence. It was fixed the next day
> (
> commit
> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3dc7b83032a
> d38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
> thanks to Ken Knowles who pinged me about it.
>
> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <valentyn@google.com
> >
> wrote:
>
> > Hi Beam-Dev,
> >
> > It's been >5 days since the last successful run of a
> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days since last
> > successful run of beam_PreCommit_Java_MavenInstall[2].
> >
> > Looking at build logs I see following problems.
> >
> > 1. After October 17, postcommit builds started to fail with
> >
> > Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
> > (default) on project beam-parent: Too many files with unapproved
> license: 1
> > See RAT report in: /home/jenkins/jenkins-slave/wo
> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
> > t-2.3.0-SNAPSHOT.rat
> >
> > The earliest build that I see this error is Postcommit #5052 [3].
> >
> > This makes me suspect [4] or [5] as a breaking change, since they change
> > pom files.
> >
> > Questions:
> > - Is there a way we can reproduce this failure locally? mvn clean verify
> > passes locally for me.
> > - Is there a way we can see the See RAT report mentioned in the error
> > log?
> >
> > 2. Prior to onset of #1 Java Precommit builds no longer complete within
> > allotted 150 min time. Looking at [6-8] it seems the build makes
> consistent
> > progress, but just does not finish on time. We can also see several
> recent
> > successful builds with execution time very close to time out [9-11].
> >
> > I'd like to propose to increase time limit for Java precommit test suite
> > from 2.5 to 4 hours. 4 hours is long time. I agree that we should
> > definitely try to reduce the test execution time, and reduce flakiness.
> > However we need the tests at least pass for now. If we write off failed
> > test suites as 'flakes' and merge PRs without having a green test signal,
> > we will have to spend more time tracing breakages such as #1.
> >
> > Thoughts?
> >
> > Thanks,
> > Valentyn
> >
> > [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/
> > [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/
> > [3] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
> > all/5052/changes
> >
> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
> > c3c67ba3e2617cb7f11a8c
> > [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b
> > 3e9a26b6c9d0bb6dc3
> >
> > [6] https://builds.apache.org/job/beam_PreCommit_Java_
> MavenInstall/15222/
> > [7] https://builds.apache.org/job/beam_PreCommit_Java_
> MavenInstall/15195/
> > [8] https://builds.apache.org/job/beam_PreCommit_Java_
> MavenInstall/15220/
> >
> > [9] https://builds.apache.org/job/beam_PreCommit_Java_
> MavenInstall/15009/
> > [10] https://builds.apache.org/job/beam_PreCommit_Java_
> MavenInstall/15068/
> > [11] https://builds.apache.org/job/beam_PreCommit_Java_
> MavenInstall/15016/
> >
> >
>

Re: Java pre/post commit test suite breakage

Posted by Raghu Angadi <ra...@google.com.INVALID>.
Regd (1) :

[4] did have have a file without Apache Licence. It was fixed the next day (
commit
<https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3dc7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
thanks to Ken Knowles who pinged me about it.

On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <va...@google.com>
wrote:

> Hi Beam-Dev,
>
> It's been >5 days since the last successful run of a
> beam_PreCommit_Java_MavenInstall build[1]  and >4 days since last
> successful run of beam_PreCommit_Java_MavenInstall[2].
>
> Looking at build logs I see following problems.
>
> 1. After October 17, postcommit builds started to fail with
>
> Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
> (default) on project beam-parent: Too many files with unapproved license: 1
> See RAT report in: /home/jenkins/jenkins-slave/wo
> rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
> t-2.3.0-SNAPSHOT.rat
>
> The earliest build that I see this error is Postcommit #5052 [3].
>
> This makes me suspect [4] or [5] as a breaking change, since they change
> pom files.
>
> Questions:
> - Is there a way we can reproduce this failure locally? mvn clean verify
> passes locally for me.
> - Is there a way we can see the See RAT report mentioned in the error
> log?
>
> 2. Prior to onset of #1 Java Precommit builds no longer complete within
> allotted 150 min time. Looking at [6-8] it seems the build makes consistent
> progress, but just does not finish on time. We can also see several recent
> successful builds with execution time very close to time out [9-11].
>
> I'd like to propose to increase time limit for Java precommit test suite
> from 2.5 to 4 hours. 4 hours is long time. I agree that we should
> definitely try to reduce the test execution time, and reduce flakiness.
> However we need the tests at least pass for now. If we write off failed
> test suites as 'flakes' and merge PRs without having a green test signal,
> we will have to spend more time tracing breakages such as #1.
>
> Thoughts?
>
> Thanks,
> Valentyn
>
> [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/
> [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/
> [3] https://builds.apache.org/job/beam_PostCommit_Java_MavenInst
> all/5052/changes
>
> [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3b
> c3c67ba3e2617cb7f11a8c
> [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b
> 3e9a26b6c9d0bb6dc3
>
> [6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15222/
> [7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15195/
> [8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15220/
>
> [9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15009/
> [10] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15068/
> [11] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15016/
>
>