Posted to derby-dev@db.apache.org by Mike Matrigali <mi...@sbcglobal.net> on 2013/04/05 19:43:55 UTC

concern about nightly test failures on 10.10 branch prior to release.

I have been looking at the nightly results from the last few weeks, with
an eye to making sure the 10.10 release does not have regressions over
previous releases.

I think in the past we have tried to get clean nightly test runs before
making a release.  That is hard because there are known intermittent
errors, which make it difficult to tell whether a failure is a regression.

Looking at the public nightly test runs for 10.10 I see:

Java DB Testing:
http://download.java.net/javadesktop/derby/10.10.html
Currently has only 1 clean run in a row; 3 of the most recent 7 runs have
failures.  I think it only runs on checkins, so testing of intermittent
bugs is sparse.

IBM Testing:
http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/derbyall_history.html
http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/suites.All_history.html
http://people.apache.org/~myrnavl/derby_test_results/v10_10/linux/derbyall_history.html
http://people.apache.org/~myrnavl/derby_test_results/v10_10/linux/suites.All_history.html
Currently has 0 clean runs in a row, and I don't think there has been a
"clean" day for 2 weeks.

I have not had time to look at all the failures to determine whether they
are regressions.  While not totally clean, the 10.9 runs for the IBM
Testing are much cleaner, so by that metric alone it seems 10.10 is not
ready to ship:
http://people.apache.org/~myrnavl/derby_test_results/v10_9/windows/suites.All_history.html
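
The two metrics used in the message above (consecutive clean runs, and
failures among the most recent N runs) can be sketched as follows. This is
only an illustration: the function names are hypothetical, and the run
history is made up to reproduce the "1 in a row" and "3 of 7" figures. The
real ordering of the runs is not given in the message.

```python
from typing import Sequence


def clean_streak(runs: Sequence[bool]) -> int:
    """Count consecutive clean runs, starting from the most recent.

    `runs` is ordered newest first; True means the run was clean.
    """
    streak = 0
    for clean in runs:
        if not clean:
            break
        streak += 1
    return streak


def failures_in_recent(runs: Sequence[bool], window: int) -> int:
    """Count failed runs among the `window` most recent runs."""
    return sum(1 for clean in runs[:window] if not clean)


# Hypothetical history, newest first (NOT the actual lab results).
example_runs = [True, False, True, True, False, True, False, True]

print(clean_streak(example_runs))
print(failures_in_recent(example_runs, 7))
```

A streak of 0 simply means the most recent run had a failure, which is the
situation described for the IBM runs.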

Re: concern about nightly test failures on 10.10 branch prior to release.

Posted by Rick Hillegas <ri...@oracle.com>.

On 4/5/13 10:43 AM, Mike Matrigali wrote:
> I have been looking at nightly results of the last few weeks, with an
> eye to making sure 10.10 release does not have regressions over previous
> releases.
>
> I think in the past we have tried to get clean nightly test runs 
> before making a release.  It is a problem as there are known 
> intermittent errors, which make it hard to know if errors are 
> regressions or not.
>
> Looking at the public nightly test runs for 10.10 I see:
>
> Java DB Testing:
> http://download.java.net/javadesktop/derby/10.10.html
> currently has 1 in a row clean recent runs, 3 out of most recent 7 
> have failures.  I think it is only running on checkins, so testing
> of intermittent bugs is sparse.
>
> IBM Testing:
> http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/derbyall_history.html 
>
> http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/suites.All_history.html 
>
> http://people.apache.org/~myrnavl/derby_test_results/v10_10/linux/derbyall_history.html 
>
> http://people.apache.org/~myrnavl/derby_test_results/v10_10/linux/suites.All_history.html 
>
> currently has 0 in a row clean recent runs, and I don't think there
> has been a "clean" day for 2 weeks.
>
> I have not had time to look at all the failures, to determine if they
> are regressions or not.  While not totally clean the 10.9 runs
> for the IBM Testing are much cleaner, so just using that metric it
> seems 10.10 is not ready to ship.:
> http://people.apache.org/~myrnavl/derby_test_results/v10_9/windows/suites.All_history.html 
>
>
Thanks to Mike for raising this issue and thanks to Knut for analyzing
the problems in the Oracle test lab. I am also disappointed by the
signal-to-noise ratio in the nightly/continuous test results coming out
of the Oracle lab. That lab is still being debugged. I did a quick
calculation of bad runs vs. total runs for Oracle tests on the 10.10,
10.9, and 10.8 branches:

10.10: 30% failure rate
10.9: 44% failure rate
10.8: 41% failure rate

We clearly need to stabilize the Oracle lab. And the Derby tests have 
too many heisenbugs. But the results for the 10.10 branch don't look 
worse to me than the results for the 10.9 and 10.8 branches.
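
The "bad runs vs. total runs" comparison above can be sketched like this.
The percentages are the ones quoted in this message; the raw run counts
behind them are not given, so the helper names and any counts passed to
`failure_rate` are assumptions for illustration only.

```python
def failure_rate(bad_runs: int, total_runs: int) -> float:
    """Return the fraction of runs that had at least one failure."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    if not 0 <= bad_runs <= total_runs:
        raise ValueError("bad_runs must be between 0 and total_runs")
    return bad_runs / total_runs


# Failure rates quoted in the message, in percent.
quoted_rates = {"10.10": 30, "10.9": 44, "10.8": 41}


def not_worse_than_others(rates: dict, candidate: str) -> bool:
    """True if the candidate's rate is no higher than any other branch's."""
    return all(
        rates[candidate] <= rate
        for branch, rate in rates.items()
        if branch != candidate
    )


print(not_worse_than_others(quoted_rates, "10.10"))
```

Note that this compares raw rates only. It says nothing about whether the
individual failures are the same ones on each branch.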

Moving on to the release candidate itself, here's a comparison of 
distinct test failures reported during platform testing of the last 3 
feature releases:

10.10.1: 4 distinct failures
10.9.1: 8 distinct failures
10.8.1: 10 distinct failures

Again, the platform test results for 10.10.1 don't look worse to me than 
the results for 10.9.1 and 10.8.1.

I'm prepared to extend the vote by a week if that would help people 
analyze the failures seen in the IBM lab. Let me know if I should do that.

Thanks,
-Rick

Re: concern about nightly test failures on 10.10 branch prior to release.

Posted by Knut Anders Hatlen <kn...@oracle.com>.

Mike Matrigali <mi...@sbcglobal.net> writes:

> I have been looking at nightly results of the last few weeks, with an
> eye to making sure 10.10 release does not have regressions over previous
> releases.
>
> I think in the past we have tried to get clean nightly test runs
> before making a release.  It is a problem as there are known
> intermittent errors, which make it hard to know if errors are
> regressions or not.
>
> Looking at the public nightly test runs for 10.10 I see:
>
> Java DB Testing:
> http://download.java.net/javadesktop/derby/10.10.html
> currently has 1 in a row clean recent runs, 3 out of most recent 7
> have failures.  I think it is only running on checkins, so testing
> of intermittent bugs is sparse.

I went through the failures in the above link. I found these:

1) some occurrences of DERBY-973, DERBY-3519 and DERBY-5172

2) sometimes Java 5 dumps core on Solaris SPARC

3) sometimes the compatibility tests fail to start on Solaris x64
because there is not enough memory to start a new process

4) sometimes the replication tests fail to start on Solaris x64

5) once, the JMX tests timed out when connecting to the MBean server

6) once, some network tests in derbyall failed on Java 5, 32-bit Linux,
with permission problems and UnknownHostExceptions

The bugs that cause (1) are known intermittent bugs, not new in 10.10.

(2) must be a JVM bug.

(3) is an environment problem.

I suspect (3) and (4) might be related, except that the replication tests
hide the real cause of the failure. In any case, (4) also happens in the
testing of the 10.8 and 10.9 branches, so it's most likely not a new
problem in 10.10.

(5) has also been seen once on the 10.9 branch, so it's likely not a new
problem.

That leaves us with (6), which hasn't been seen on older branches, as
far as I can see. It only happened once on the 10.10 branch. Here's a
link to the failure:

http://download.java.net/javadesktop/derby/javadb-5574721-report/javadb-5574721-3624064-details.html

The failure has the feel of a network glitch where the local host name
cannot be resolved. I wouldn't make this particular issue a blocker for
10.10 unless we see it happening more than once.
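
The triage above amounts to labelling each failure category and keeping
only those with no explanation outside the 10.10 branch. A minimal sketch,
using the item numbers and attributions from this message (the label
strings and function name are hypothetical):

```python
# Label used for failures that have no counterpart on older branches.
NEW_IN_BRANCH = "not seen on older branches"

# Item number -> attribution, as stated in the analysis above.
attribution = {
    1: "known intermittent bug",
    2: "JVM bug",
    3: "environment problem",
    4: "also seen on 10.8 and 10.9",
    5: "also seen on 10.9",
    6: NEW_IN_BRANCH,
}


def possible_regressions(attributions: dict) -> list:
    """Return the items that cannot be explained by older branches,
    the JVM, or the test environment."""
    return sorted(
        item for item, label in attributions.items()
        if label == NEW_IN_BRANCH
    )


print(possible_regressions(attribution))
```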
