Posted to dev@river.apache.org by Patricia Shanahan <pa...@acm.org> on 2010/08/24 21:58:20 UTC

Ignored tests

I ran a batch of the previously ignored QA tests overnight. I got 156
passes and 64 failures. This is nowhere near as bad as it sounds,
because many of the failures fell into clusters of related tests
failing in similar ways, suggesting a single problem in the base
infrastructure for each test category. Some of the failures may relate
to the known regression that Peter is going to look at this week.

Also, it is important to remember that the bugs may be in the tests, not 
in the code under test. A test may be obsolete, depending on behavior 
that is no longer supported.

I do think there is a good enough chance that at least one of the 
failures represents a real problem, and an opportunity to improve River, 
that I plan to start a background activity looking at failed tests to 
see what is going on. The objective is to do one of three things for 
each cluster of failures:

1. Fix River.

2. Fix the test.

3. Decide the test is unfixable, and delete it. There is no point 
spending disk space, file transfer time, and test load time on tests we 
are never going to run.

Running the subset I did last night took about 15 hours, but that 
included a lot of timeouts.
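
To illustrate why the timeouts dominate the wall-clock time, here is a
rough sketch (not the actual QA harness; the class name, the 10-minute
limit, and runOneTest() are invented for the example) of running a
single test under a hard time limit, so a hang is reported as a timeout
instead of stalling the rest of the batch:

import java.util.concurrent.*;

public class TimeoutRunner {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // Hypothetical stand-in for invoking one QA test.
        Callable<Boolean> test = () -> runOneTest();
        Future<Boolean> result = pool.submit(test);
        try {
            boolean passed = result.get(10, TimeUnit.MINUTES);
            System.out.println(passed ? "PASS" : "FAIL");
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt the hung test
            System.out.println("TIMEOUT");
        } finally {
            pool.shutdownNow();
        }
    }

    // Placeholder only; a real harness would run the actual test here.
    private static boolean runOneTest() {
        return true;
    }
}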

Patricia

Re: Ignored tests

Posted by Jonathan Costers <jo...@googlemail.com>.
The most common failures are:
- failures to find any Kerberos configuration file (/etc/krb5.conf or
similar) -> Kerberos infrastructure
- failures to resolve certain host names (for instance: jiniproxy) ->
proxy infrastructure
I believe JIRA issues exist for the missing Kerberos and proxy
infrastructure.

Any others are to be looked at with suspicion and handled on a
case-by-case basis.
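
To make the two dependencies concrete, here is a rough sketch (nothing
from the QA harness itself; the class name and the fallback path are
just examples) of a preflight check that verifies a Kerberos
configuration file is present and that the jiniproxy host name
resolves before those test categories are attempted. The
java.security.krb5.conf property is the standard JDK way of pointing
at a krb5 config file.

import java.io.File;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class InfraPreflight {
    public static void main(String[] args) {
        // Kerberos tests need a krb5 config; check the JDK property,
        // falling back to the conventional /etc/krb5.conf location.
        String krb5 = System.getProperty("java.security.krb5.conf",
                                         "/etc/krb5.conf");
        if (!new File(krb5).isFile()) {
            System.out.println("Kerberos categories will fail: no config at " + krb5);
        }
        // Proxy tests expect the jiniproxy host name to resolve.
        try {
            InetAddress.getByName("jiniproxy");
        } catch (UnknownHostException e) {
            System.out.println("Proxy categories will fail: cannot resolve jiniproxy");
        }
    }
}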


Re: Ignored tests

Posted by Patricia Shanahan <pa...@acm.org>.
Can you give any guidance on how to find out which tests need what 
infrastructure? Is it documented somewhere? I'm still learning my way 
around the River files.

Also, I'm interested in tests that fail unexpectedly, especially any 
tests that have regressed or fail intermittently without related source 
code changes.

I have a suspicion, based on source code reading, of a race condition in 
ServiceDiscoveryManager, and problems related to retries in some 
subclasses of RetryTask. If these problems are real, they would tend to
lead to unreproducible, intermittent failures rather than solid failures.
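
To show the kind of thing I have in mind (just a sketch, nothing
River-specific): an intermittent race usually only shows up when the
suspect operation is repeated many times, along these lines, where
check() is a hypothetical stand-in for whatever ServiceDiscoveryManager
or RetryTask scenario is being exercised:

import java.util.concurrent.Callable;

public class RepeatRunner {
    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for the suspect operation under test.
        Callable<Boolean> check = () -> true;
        int runs = 1000;
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            try {
                if (!check.call()) {
                    failures++;
                }
            } catch (Exception e) {
                failures++; // an exception counts as a failure too
            }
        }
        System.out.println(failures + " failures in " + runs + " runs");
    }
}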

Patricia


Re: Ignored tests

Posted by Jonathan Costers <jo...@googlemail.com>.
There is one more test category that we could add to the list used by
Hudson: "renewalmanager".
All the other categories have one or more issues (I have run all these tests
myself many, many times), mostly because of missing infrastructure, but some
also fail unexpectedly.


Re: Ignored tests

Posted by Patricia Shanahan <pa...@acm.org>.
I'm not sure how much that would tell us, done on a bulk basis, because
some of the tests will be specific to bugs that were found and fixed
since then.

I will be doing something similar for individual tests, but taking into 
account what their comments tell me about which versions are expected to 
pass.

Patricia


Re: Ignored tests

Posted by Patrick Wright <pd...@gmail.com>.
Hi Patricia

Is there perhaps a solid baseline to test against, for example Jini
2.1, to see how many passes and failures we get?

Thanks for all the hard work
Patrick
