Posted to dev@lucene.apache.org by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org> on 2014/07/22 08:21:38 UTC

[jira] [Updated] (SOLR-6231) RollingRestartTest failures on jenkins

     [ https://issues.apache.org/jira/browse/SOLR-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-6231:
----------------------------------------

    Attachment: SOLR-6231.patch

The good thing about this failure is that in every instance I've seen, we do have an overseer; it's just not one of the designates. I looked at the logs of a few failures, and it seems that re-prioritization was still in progress when we timed out.

Here's a patch to harden the process. We have an overall maximum timeout of 300 seconds and a shorter 60-second timeout for finding a designate, which is pushed further ahead each time we see a new overseer elected. The idea is that if the overseer hasn't changed within 60 seconds, we're unlikely to see a new one and should stop waiting. But if the overseer did change, re-prioritization is still in progress and we should wait longer.
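
To illustrate the timeout logic described above, here is a minimal sketch. It is not the code in SOLR-6231.patch: the class name OverseerWait, the method waitForDesignate and all of its parameters are hypothetical, and the overseer lookup is abstracted as a Supplier rather than a real ZooKeeper or Collections API call.

```java
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, for illustration only; not taken from the patch.
final class OverseerWait {
    private OverseerWait() {}

    /**
     * Polls until one of the designates is the overseer.
     * Gives up when either the hard limit (maxWaitMs) is reached or the
     * overseer has not changed for quietWaitMs.
     */
    static boolean waitForDesignate(Supplier<String> currentOverseer,
                                    Set<String> designates,
                                    long maxWaitMs,
                                    long quietWaitMs,
                                    long pollMs) {
        long start = System.nanoTime();
        long hardDeadline = start + TimeUnit.MILLISECONDS.toNanos(maxWaitMs);
        long quietDeadline = start + TimeUnit.MILLISECONDS.toNanos(quietWaitMs);
        String lastSeen = null;

        while (true) {
            String overseer = currentOverseer.get();
            if (overseer != null && designates.contains(overseer)) {
                return true; // a designate has taken over
            }

            long now = System.nanoTime();
            if (!Objects.equals(overseer, lastSeen)) {
                // A new overseer was elected, so re-prioritization is
                // presumably still running: push the short deadline ahead.
                lastSeen = overseer;
                quietDeadline = now + TimeUnit.MILLISECONDS.toNanos(quietWaitMs);
            }

            // Subtraction-based comparison is the overflow-safe way to
            // compare System.nanoTime() values.
            if (now - hardDeadline >= 0 || now - quietDeadline >= 0) {
                return false; // hard cap hit, or no change within the quiet window
            }

            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With the values from this comment, a call would look something like waitForDesignate(lookup, designates, 300_000, 60_000, 1_000), where lookup is whatever the test uses to read the current overseer and the one-second poll interval is an arbitrary choice.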

> RollingRestartTest failures on jenkins
> --------------------------------------
>
>                 Key: SOLR-6231
>                 URL: https://issues.apache.org/jira/browse/SOLR-6231
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud, Tests
>            Reporter: Shalin Shekhar Mangar
>             Fix For: 4.10
>
>         Attachments: SOLR-6231.patch
>
>
> A somewhat rare failure on Jenkins. An overseer was available to service requests, but even after waiting for 60 seconds, none of the designates was the overseer.
> {code}
> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/4081/
> Java: 32bit/jdk1.8.0_20-ea-b21 -client -XX:+UseSerialGC
> 1 tests failed.
> REGRESSION:  org.apache.solr.cloud.RollingRestartTest.testDistribSearch
> Error Message:
> No overseer designate as leader found after restart #3: 127.0.0.1:60996_
> Stack Trace:
> java.lang.AssertionError: No overseer designate as leader found after restart #3: 127.0.0.1:60996_
>         at __randomizedtesting.SeedInfo.seed([5263BF570390CF79:D385314F74CFAF45]:0)
>         at org.junit.Assert.fail(Assert.java:93)
>         at org.apache.solr.cloud.RollingRestartTest.restartWithRolesTest(RollingRestartTest.java:100)
>         at org.apache.solr.cloud.RollingRestartTest.doTest(RollingRestartTest.java:61)
>         at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:865)
> {code}


