Posted to issues@geode.apache.org by "Jason Huynh (JIRA)" <ji...@apache.org> on 2018/09/06 17:54:00 UTC

[jira] [Commented] (GEODE-5700) CI failures from new tests in PartitionedRegionCompactRangeIndexDUnitTest

    [ https://issues.apache.org/jira/browse/GEODE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606173#comment-16606173 ] 

Jason Huynh commented on GEODE-5700:
------------------------------------

A race condition between stopping a cache server and a thread volunteering to become primary can cause this issue.  The volunteering thread would normally throw a cancel exception, but in this case the cache is not closed or closing at that moment, so it throws an IllegalStateException instead.  That exception gets logged, whereas a cancel exception does not.

The volunteering thread must also have obtained the list of cache servers before the cache server shut down.  It iterates over this stale list and must have gotten past the isRunning() check as well...
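
To make the suspected interleaving concrete, here is a minimal sketch that replays it on a single thread. ToyServer and replayRace are hypothetical stand-ins invented for illustration, not Geode classes; the only behavior borrowed from the report is that asking a stopped server for its address throws IllegalStateException.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleServerListSketch {

    /** Hypothetical stand-in for a cache server; NOT Geode's CacheServerImpl. */
    static final class ToyServer {
        private boolean running = true;

        boolean isRunning() {
            return running;
        }

        void stop() {
            running = false;
        }

        String getExternalAddress() {
            if (!running) {
                throw new IllegalStateException(
                        "bind address is only available if the server has been started");
            }
            return "localhost";
        }
    }

    /** Replays the check-then-act steps in the order the hypothesis assumes. */
    static String replayRace() {
        List<ToyServer> liveServers = new ArrayList<>();
        liveServers.add(new ToyServer());

        // Volunteering thread: copies the server list while the server is still up.
        List<ToyServer> staleSnapshot = new ArrayList<>(liveServers);
        ToyServer server = staleSnapshot.get(0);

        // Volunteering thread: the isRunning() check passes.
        boolean passedCheck = server.isRunning();

        // Test thread: stops the server between the check and the use.
        liveServers.get(0).stop();

        // Volunteering thread: acts on the now-outdated check result.
        try {
            if (passedCheck) {
                server.getExternalAddress();
            }
            return "no exception";
        } catch (IllegalStateException e) {
            return "IllegalStateException";
        }
    }

    public static void main(String[] args) {
        System.out.println(replayRace());
    }
}
```

The sketch only shows that a passed isRunning() check gives no protection once the server can be stopped concurrently; it does not claim to reproduce the actual BucketAdvisor code path.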

I think that in ServerStarterRule we close the cacheServer explicitly without closing the cache, and then close the cache a few lines later.  I am not sure why this is done in two steps (I believe Cache close should stop all cache servers).
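
As a rough illustration of why the two-step shutdown might matter, the toy model below compares the two orders. Everything in it is an assumption made for the sketch: ToyCache, ToyServer and ToyCancelException are hypothetical names, and the model simply assumes that closing the cache marks it as closing and stops its servers, and that a caller sees a cancel-style exception once the cache is closing.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderSketch {

    /** Hypothetical cancel-style exception; stands in for the one that is not logged. */
    static final class ToyCancelException extends RuntimeException {
        ToyCancelException(String message) {
            super(message);
        }
    }

    /** Hypothetical cache: close() marks it closing, then stops every server. */
    static final class ToyCache {
        private boolean closing = false;
        private final List<ToyServer> servers = new ArrayList<>();

        ToyServer addServer() {
            ToyServer server = new ToyServer(this);
            servers.add(server);
            return server;
        }

        boolean isClosing() {
            return closing;
        }

        void close() {
            closing = true;
            for (ToyServer server : servers) {
                server.stop();
            }
        }
    }

    /** Hypothetical server: checks for cancellation before checking its own state. */
    static final class ToyServer {
        private final ToyCache cache;
        private boolean running = true;

        ToyServer(ToyCache cache) {
            this.cache = cache;
        }

        void stop() {
            running = false;
        }

        String getExternalAddress() {
            if (cache.isClosing()) {
                throw new ToyCancelException("cache is closing");
            }
            if (!running) {
                throw new IllegalStateException("server has not been started");
            }
            return "localhost";
        }
    }

    /** Returns which exception a late caller sees for the given shutdown order. */
    static String outcome(boolean stopServerFirst) {
        ToyCache cache = new ToyCache();
        ToyServer server = cache.addServer();
        if (stopServerFirst) {
            server.stop();   // step one of the two-step shutdown; cache still open
        } else {
            cache.close();   // single step: cache is closing before servers stop
        }
        try {
            server.getExternalAddress();
            return "ok";
        } catch (ToyCancelException e) {
            return "cancel";
        } catch (IllegalStateException e) {
            return "illegal-state";
        }
    }

    public static void main(String[] args) {
        System.out.println("server stopped first: " + outcome(true));
        System.out.println("cache closed only:    " + outcome(false));
    }
}
```

In this model the window between stopping the server and closing the cache is the only place where the late caller gets the IllegalStateException; whether the real code behaves the same way is exactly what still needs investigating.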

Going to investigate a bit further but this is my hypothesis for now...

> CI failures from new tests in PartitionedRegionCompactRangeIndexDUnitTest
> -------------------------------------------------------------------------
>
>                 Key: GEODE-5700
>                 URL: https://issues.apache.org/jira/browse/GEODE-5700
>             Project: Geode
>          Issue Type: Improvement
>            Reporter: Dan Smith
>            Assignee: Jason Huynh
>            Priority: Major
>              Labels: swat
>
> We are seeing a couple of the new tests in PartitionedRegionCompactRangeIndexDUnitTest fail in CI
> {noformat}
> org.apache.geode.cache.query.dunit.PartitionedRegionCompactRangeIndexDUnitTest:  2 failures (99.265% success rate)
>  |  .giiWithPersistenceAndStaleDataDueToDeletesShouldHaveEmptyIndexesWithEntrySet:  1 failures (99.632% success rate)
>  |   |  Failed build 376  at https://concourse.apachegeode-ci.info/teams/staging/pipelines/concourse-staging/jobs/DistributedTest/builds/376
>  |  .giiWithPersistenceAndStaleDataDueToDeletesShouldHaveEmptyIndexes:  1 failures (99.632% success rate)
>  |   |  Failed build 499  at https://concourse.apachegeode-ci.info/teams/staging/pipelines/concourse-staging/jobs/DistributedTest/builds/499
> {noformat}
> {noformat}
> org.apache.geode.cache.query.dunit.PartitionedRegionCompactRangeIndexDUnitTest > giiWithPersistenceAndStaleDataDueToDeletesShouldHaveEmptyIndexes FAILED
>     java.lang.AssertionError: Suspicious strings were written to the log during this run.
>     Fix the strings or use IgnoredException.addIgnoredException to ignore.
>     -----------------------------------------------------------------------
>     Found suspect string in log4j at line 7947
> 	
>     [error 2018/08/30 21:32:07.028 UTC <Pooled Waiting Message Processor 1> tid=0x9d6] A bridge server's bind address is only available if it has been started
>     java.lang.IllegalStateException: A bridge server's bind address is only available if it has been started
>     	at org.apache.geode.internal.cache.CacheServerImpl.getExternalAddress(CacheServerImpl.java:415)
>     	at org.apache.geode.internal.cache.CacheServerImpl.getExternalAddress(CacheServerImpl.java:407)
>     	at org.apache.geode.internal.cache.BucketAdvisor.instantiateProfile(BucketAdvisor.java:1690)
>     	at org.apache.geode.distributed.internal.DistributionAdvisor.createProfile(DistributionAdvisor.java:1026)
>     	at org.apache.geode.internal.cache.BucketAdvisor.sendProfileUpdate(BucketAdvisor.java:1651)
>     	at org.apache.geode.internal.cache.BucketAdvisor.acquiredPrimaryLock(BucketAdvisor.java:1196)
>     	at org.apache.geode.internal.cache.BucketAdvisor$VolunteeringDelegate.doVolunteerForPrimary(BucketAdvisor.java:2586)
>     	at org.apache.geode.internal.cache.BucketAdvisor$VolunteeringDelegate$1.run(BucketAdvisor.java:2484)
>     	at org.apache.geode.internal.cache.BucketAdvisor$VolunteeringDelegate$2.run(BucketAdvisor.java:2803)
>     	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     	at org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1136)
>     	at org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:112)
>     	at org.apache.geode.distributed.internal.ClusterDistributionManager$6$1.run(ClusterDistributionManager.java:882)
>     	at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)