Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2017/03/14 21:04:41 UTC

[jira] [Commented] (SOLR-6286) TestReplicationHandler.doTestReplicateAfterCoreReload failure on jenkins

    [ https://issues.apache.org/jira/browse/SOLR-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924991#comment-15924991 ] 

Hoss Man commented on SOLR-6286:
--------------------------------

I recently filed a similar issue but only now noticed this one -- I'll resolve SOLR-10251 as a dup, but here are my particular observations on this test/failure from that issue...

{noformat}
  [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestReplicationHandler -Dtests.method=doTestReplicateAfterCoreReload -Dtests.seed=6F2AD3669775C0E9 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=ky-KG -Dtests.timezone=Etc/GMT+10 -Dtests.asserts=true -Dtests.file.encoding=ANSI_X3.4-1968
   [junit4] FAILURE 57.2s | TestReplicationHandler.doTestReplicateAfterCoreReload <<<

{noformat}
{quote}
A few misc observations...
* this line is comparing the commits on master _now_ to the commits on master just prior to a core reload
** so the failure has nothing to do with replication
** Looks like a merge is happening before/after reload -- but before test gets list of commits?
*** Possible from RandomMergePolicy?
* At the line where this test fails, a non-nightly run won't have indexed a single doc -- so this particular failure will only be observable with {{-Dtests.nightly=true}} ...
{code}
int docs = TEST_NIGHTLY ? 200000 : 0;
{code}
* I don't understand the point of this test at all ... it doesn't compare anything between master/slave except after a commit -- so where does the "AfterCoreReload" part come into play?
** it's particularly wonky given that half of the asserts comparing master/slave are about having an identical {{numFound=0}} for a {{\*:\*}} search against an empty index! (unless nightly)
{quote}
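
To make the first observation above concrete, here is a minimal standalone Java sketch. It is not the actual test code and not a proposed fix: the {{commit}} and {{isPrefixOf}} helpers are hypothetical names invented for illustration, and the file lists are heavily abbreviated from the failure in the issue description. It only shows why a strict equality check on the master's commit list fails as soon as something (for example a merge) writes a new commit generation between the two snapshots, even though the earlier commit is unchanged.

```java
import java.util.List;
import java.util.Map;

public class CommitListSketch {
    // Hypothetical stand-in for one entry in the master's commit list.
    public static Map<String, Object> commit(long indexVersion, long generation, String... files) {
        return Map.of(
            "indexVersion", indexVersion,
            "generation", generation,
            "filelist", List.of(files));
    }

    // True if every commit seen before the reload is still present, in the
    // same order, at the start of the list seen after the reload.
    public static boolean isPrefixOf(List<Map<String, Object>> before,
                                     List<Map<String, Object>> after) {
        return after.size() >= before.size()
            && after.subList(0, before.size()).equals(before);
    }

    public static void main(String[] args) {
        // Abbreviated from the issue: same indexVersion, different generation and files.
        Map<String, Object> gen2 = commit(1406477990053L, 2, "_bta.si", "_yok.si", "segments_2");
        Map<String, Object> gen3 = commit(1406477990053L, 3, "_bta.si", "_ypc.si", "segments_3");

        List<Map<String, Object>> beforeReload = List.of(gen2);
        List<Map<String, Object>> afterReload = List.of(gen2, gen3);

        // The strict comparison (what an assertEquals on the two lists does).
        System.out.println("strict equality: " + beforeReload.equals(afterReload));
        // A looser comparison that tolerates an extra, newer commit.
        System.out.println("prefix check:    " + isPrefixOf(beforeReload, afterReload));
    }
}
```

Whether a looser comparison like the prefix check is appropriate depends on what the test is actually meant to verify, which is the open question raised in the list above.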

----

In response to some earlier comments here in SOLR-6286...

bq. ... I'd expect that since there were no pending changes, there's be no need to write a new segment. ...

That seems like a naive assumption given the randomized merge settings -- there could easily be background merges in other threads, or the randomized merge scheduler could decide to do an arbitrary/useless merge on commit (IIRC)

bq. ... This failure only happens sometimes. ...

Note my other comments above: this test is virtually useless unless you are running in nightly mode ... you'll never see this failure w/o it.

When nightly mode is enabled, the seeds I've seen fail do so reliably.
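
As a rough illustration of why the nightly flag matters, the sketch below wraps the ternary quoted from the test in a hypothetical {{docsToIndex}} helper (the class and method names are made up for this example). With zero docs there are no segments to merge, and a master/slave {{numFound}} comparison is between two empty indexes.

```java
public class NightlyDocsSketch {
    // Mirrors the ternary quoted from the test; the method name is hypothetical.
    public static int docsToIndex(boolean nightly) {
        return nightly ? 200000 : 0;
    }

    public static void main(String[] args) {
        for (boolean nightly : new boolean[] {false, true}) {
            int docs = docsToIndex(nightly);
            // When docs == 0, both sides would report numFound=0, so comparing
            // them cannot distinguish a working replica from a broken one.
            System.out.println("nightly=" + nightly
                + " docs=" + docs
                + " emptyIndex=" + (docs == 0));
        }
    }
}
```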

bq. The indexversion doesn't change though and the slave says that it is in sync with the master. 

I'm not sure what the test looked like when this comment was made, but as of right now on master this test never makes any attempt to verify the slave's replication status / indexversion after the (master) core reload. It only compares the slave numFound *before* the core reload, and then again *after* both the reload *AND* an additional master commit -- an assert the test never reaches when failures like those listed in this issue happen. So it's impossible to reliably say the slave is in sync with the master at that point (let alone whether it was in sync after the core reload but before the subsequent additional adds/commits).


> TestReplicationHandler.doTestReplicateAfterCoreReload failure on jenkins
> ------------------------------------------------------------------------
>
>                 Key: SOLR-6286
>                 URL: https://issues.apache.org/jira/browse/SOLR-6286
>             Project: Solr
>          Issue Type: Bug
>          Components: Tests
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 4.10, 6.0
>
>
> There have been a few failures on jenkins.
> {code}
> 3 tests failed.
> REGRESSION:  org.apache.solr.handler.TestReplicationHandler.doTestReplicateAfterCoreReload
> Error Message:
> expected:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt, _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim, _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si, _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs, _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe, _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si, _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs, _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe, _ypy.cfs, _ypy.si, segments_2]}]> but was:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt, _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim, _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si, _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs, _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe, _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si, _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs, _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe, _ypy.cfs, _ypy.si, segments_2]}, {indexVersion=1406477990053,generation=3,filelist=[_bta.fdt, _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim, _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si, _ypc.cfe, _ypc.cfs, _ypc.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs, _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe, _ypy.cfs, _ypy.si, segments_3]}]>
> Stack Trace:
> java.lang.AssertionError: expected:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt, _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim, _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si, _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs, _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe, _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si, _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs, _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe, _ypy.cfs, _ypy.si, segments_2]}]> but was:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt, _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim, _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si, _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs, _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe, _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si, _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs, _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe, _ypy.cfs, _ypy.si, segments_2]}, {indexVersion=1406477990053,generation=3,filelist=[_bta.fdt, _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim, _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si, _ypc.cfe, _ypc.cfs, _ypc.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs, _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe, _ypy.cfs, _ypy.si, segments_3]}]>
>         at __randomizedtesting.SeedInfo.seed([E4FFCDCA8EC968BC:C128D6FAFE8166BF]:0)
>         at org.junit.Assert.fail(Assert.java:93)
>         at org.junit.Assert.failNotEquals(Assert.java:647)
>         at org.junit.Assert.assertEquals(Assert.java:128)
>         at org.junit.Assert.assertEquals(Assert.java:147)
>         at org.apache.solr.handler.TestReplicationHandler.doTestReplicateAfterCoreReload(TestReplicationHandler.java:1190)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> {code}
> https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/585/


