Posted to dev@geode.apache.org by Ashvin A <aa...@gmail.com> on 2015/08/19 20:49:33 UTC

Investigate test failures: Build 189

Hi,

The latest nightly build is reporting 4 test failures. These tests do not
fail when I run them locally. Based on a quick search it also seems these
tests failed for the first time and are not related to recent code changes.

   1. c.g.g.i.c.ClientServerTransactionDUnitTest.testClientCommitFunction
   2. c.g.g.i.c.ConnectDisconnectDUnitTest.testManyConnectsAndDisconnects
   3. c.g.g.i.c.PartitionedRegionAsSubRegionDUnitTest.testSubRegionLocalDestroyRegion
   4. c.g.g.i.c.PartitionedRegionAsSubRegionDUnitTest.testSubRegionClose


All these failures have a similar error message:

Found suspect string in log4j at line 1430
    com.gemstone.gemfire.cache.RegionDestroyedException:...

Can I conclude these failures are caused by some build hardware issue? Are
there any other artifacts I could look at for additional details?

Thanks,
Ashvin

Re: Investigate test failures: Build 189

Posted by Bruce Schuchardt <bs...@pivotal.io>.
The test output is available if you click on the failed test.

ConnectDisconnectDUnitTest is failing due to suspect strings logged while
shutting down the DistributedSystem from a previous test. These suspect
strings are not the RegionDestroyedException that Ashvin mentioned. I think
the exceptions point to a small shutdown problem in
RegionVersionVector.memberDeparted(). It's trying to get the
DistributionManager and schedule a job in one of its thread pools, but the
DM is shutting down and getDistributionManager() is throwing an exception.
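
To illustrate the kind of shutdown problem described above, here is a minimal
sketch of scheduling background work in a way that tolerates a pool that is
already shutting down. It is not Geode code: DepartureCleanup and
scheduleCleanup are hypothetical names, and a plain java.util.concurrent
ExecutorService stands in for the DistributionManager's thread pools.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in; this is not Geode's RegionVersionVector or DistributionManager.
final class DepartureCleanup {
  private final ExecutorService pool;

  DepartureCleanup(ExecutorService pool) {
    this.pool = pool;
  }

  /**
   * Tries to schedule cleanup work for a departed member.
   * Returns false instead of propagating an exception when the pool
   * has already been shut down and rejects the task.
   */
  boolean scheduleCleanup(Runnable cleanup) {
    try {
      pool.execute(cleanup);
      return true;
    } catch (RejectedExecutionException e) {
      // The pool is shutting down, so the cleanup is skipped quietly
      // rather than surfacing as a logged exception.
      return false;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    DepartureCleanup cleanup = new DepartureCleanup(pool);
    System.out.println(cleanup.scheduleCleanup(() -> {})); // accepted while running
    pool.shutdown();
    System.out.println(cleanup.scheduleCleanup(() -> {})); // rejected after shutdown
    pool.awaitTermination(1, TimeUnit.SECONDS);
  }
}
```

Whether skipping the work is acceptable depends on what the job does; in the
real product the right fix may be different.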

ClientServerTransactionDUnitTest got an unexpected 
TransactionDataNodeHasDepartedException that doesn't look like 
bleed-through from another test to me.

PartitionedRegionAsSubRegionDUnitTests are failing due to suspect
RegionDestroyedException strings that are not from
bleed-through/contamination. They are due to timing issues with the
test. It's locally destroying a bucket while the bucket is being
created, and that causes the thread that is creating the bucket to log
the exception.
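
For readers unfamiliar with the "suspect string" check mentioned in this
thread, the idea can be sketched as a scan over a test's log output that
reports lines matching suspicious patterns unless the test declared them as
expected. This is only an illustration: SuspectStringScan, its pattern list,
and the sample log lines are assumptions, not the DUnit framework's actual
implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; the real framework's patterns and API are not shown in this thread.
final class SuspectStringScan {
  // Illustrative patterns only.
  private static final Pattern SUSPECT =
      Pattern.compile("Exception|\\b(error|severe|fatal)\\b", Pattern.CASE_INSENSITIVE);

  /** Returns "line N: text" for each suspect line that no expected string excuses. */
  static List<String> findSuspects(List<String> logLines, List<String> expected) {
    List<String> hits = new ArrayList<>();
    for (int i = 0; i < logLines.size(); i++) {
      String line = logLines.get(i);
      if (!SUSPECT.matcher(line).find()) {
        continue; // nothing suspicious on this line
      }
      boolean allowed = false;
      for (String e : expected) {
        if (line.contains(e)) {
          allowed = true; // the test declared this string as expected
          break;
        }
      }
      if (!allowed) {
        hits.add("line " + (i + 1) + ": " + line);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    // Made-up log lines for demonstration.
    List<String> log = Arrays.asList(
        "[info] region created",
        "[warn] RegionDestroyedException: /hypothetical-region",
        "[info] done");
    System.out.println(findSuspects(log, Collections.<String>emptyList()));
    System.out.println(findSuspects(log, Arrays.asList("RegionDestroyedException")));
  }
}
```

A scan like this flags an exception logged by any thread, including a
background thread that hit a race, which is why a timing issue can fail a
test whose own assertions all passed.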





Re: Investigate test failures: Build 189

Posted by Kirk Lund <kl...@pivotal.io>.
I don't have any ideas about the cause without looking at the full
stack of RegionDestroyedException and the rest of the logs. We haven't
been seeing anything like this so I would expect it to be a real bug
that was committed.

It might be caused by race condition(s) in either the test or the product
(or both). This could then result in pollution in one or more JVMs, which
then causes later tests to fail (so it might be the earliest misbehaving
test that's of interest).

We usually copy the entire test run somewhere, merge the logs, and then
study the logs and code. It's not quick or easy and I'm not sure where ASF
is archiving the results. I would expect the archived results to be found
if you dig through all the screens at
https://builds.apache.org/job/Geode-nightly/189/ (or whatever the build #
is).
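
As a rough picture of the "merge them" step, the sketch below interleaves
log lines from several JVMs into one timeline. It is only an illustration
under stated assumptions: LogMerge is a hypothetical helper, each log line
is assumed to start with an ISO-8601 timestamp (real Geode log lines are
formatted differently), and lines without a timestamp, such as stack trace
frames, are attached to the entry before them.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the timestamp format is an assumption, not Geode's log format.
final class LogMerge {
  private static final class Stamped {
    final LocalDateTime timestamp;
    String text;

    Stamped(LocalDateTime timestamp, String text) {
      this.timestamp = timestamp;
      this.text = text;
    }
  }

  /** Merges per-JVM logs into one list ordered by timestamp, tagging each entry with its JVM. */
  static List<String> merge(Map<String, List<String>> logsByJvm) {
    List<Stamped> entries = new ArrayList<>();
    for (Map.Entry<String, List<String>> log : logsByJvm.entrySet()) {
      Stamped last = null;
      for (String line : log.getValue()) {
        int space = line.indexOf(' ');
        String token = space < 0 ? line : line.substring(0, space);
        try {
          LocalDateTime ts = LocalDateTime.parse(token);
          last = new Stamped(ts, "[" + log.getKey() + "] " + line);
          entries.add(last);
        } catch (DateTimeParseException e) {
          // No leading timestamp: treat as a continuation (e.g. a stack frame).
          if (last != null) {
            last.text += "\n" + line;
          }
        }
      }
    }
    // List.sort is stable, so entries with equal timestamps keep their input order.
    entries.sort((a, b) -> a.timestamp.compareTo(b.timestamp));
    List<String> merged = new ArrayList<>();
    for (Stamped s : entries) {
      merged.add(s.text);
    }
    return merged;
  }

  public static void main(String[] args) {
    // Made-up log lines for demonstration.
    Map<String, List<String>> logs = new LinkedHashMap<>();
    logs.put("vm0", Arrays.asList("2015-08-19T10:00:02 destroying region", "  at hypothetical.Frame"));
    logs.put("vm1", Arrays.asList("2015-08-19T10:00:01 creating bucket"));
    for (String entry : merge(logs)) {
      System.out.println(entry);
    }
  }
}
```

With a single timeline it is easier to spot the earliest misbehaving test,
though clock differences between machines can still distort the ordering.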

-Kirk



Re: Investigate test failures: Build 189

Posted by Dan Smith <ds...@pivotal.io>.
I'm pretty sure the PartitionedRegionAsSubRegionDUnitTest failures are
caused by a new race introduced by my changes for GEODE-74
<https://issues.apache.org/jira/browse/GEODE-74>. More work is happening in
a background thread, and it looks like the thread logged a warning when it
saw this exception. I'll file a bug and fix it.

-Dan


Re: Investigate test failures: Build 189

Posted by Darrel Schneider <ds...@pivotal.io>.
Is the first one different? It looks like GEODE-192:
Caused by: com.gemstone.gemfire.cache.TransactionDataNodeHasDepartedException: Could not connect to member:lucene1-us-west(18211)<v81>:18779
    at com.gemstone.gemfire.internal.cache.execute.TransactionFunctionService.onTransaction(TransactionFunctionService.java:82)
    at com.gemstone.gemfire.internal.cache.ClientServerTransactionDUnitTest$99.call(ClientServerTransactionDUnitTest.java:2748)

