Posted to issues@lucene.apache.org by "Chris M. Hostetter (Jira)" <ji...@apache.org> on 2019/11/19 22:20:00 UTC

[jira] [Commented] (SOLR-13946) SpellCheckCollatorTest.testEstimatedHitCounts reproducing failure seed

    [ https://issues.apache.org/jira/browse/SOLR-13946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977881#comment-16977881 ] 

Chris M. Hostetter commented on SOLR-13946:
-------------------------------------------

Background...

Jenkins recently identified a reproducing seed failure for SpellCheckCollatorTest.testEstimatedHitCounts on branch_8_3 that reproduces on branch_8x...
{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=SpellCheckCollatorTest -Dtests.method=testEstimatedHitCounts -Dtests.seed=AFA731DEE618DA14 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.badapples=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-8.3/test-data/enwiki.random.lines.txt -Dtests.locale=ga-IE -Dtests.timezone=Canada/Atlantic -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.16s J2 | SpellCheckCollatorTest.testEstimatedHitCounts <<<
   [junit4]    > Throwable #1: java.lang.RuntimeException: Exception during query
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([AFA731DEE618DA14:9E1C8FEB4327CAC4]:0)
   [junit4]    >        at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:998)
   [junit4]    >        at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:958)
   [junit4]    >        at org.apache.solr.spelling.SpellCheckCollatorTest.testEstimatedHitCounts(SpellCheckCollatorTest.java:569)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)
   [junit4]    > Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name='spellcheck']/lst[@name='collations']/lst[@name='collation']/long[@name='hits' and 3 <= . and . <= 13]
   [junit4]    >        xml response was: <?xml version="1.0" encoding="UTF-8"?>
   [junit4]    > <response>
   [junit4]    > <lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst><result name="response" numFound="0" start="0"></result><lst name="spellcheck"><lst name="suggestions"><lst name="everother"><int name="numFound">1</int><int name="startOffset">9</int><int name="endOffset">18</int><arr name="suggestion"><str>everyother</str></arr></lst></lst><lst name="collations"><lst name="collation"><str name="collationQuery">teststop:everyother</str><long name="hits">14</long><lst name="misspellingsAndCorrections"><str name="everother">everyother</str></lst></lst></lst></lst>
   [junit4]    > </response>
   [junit4]    >        request was:spellcheck=true&spellcheck.dictionary=direct&spellcheck.count=1&spellcheck.collate=true&spellcheck.maxCollationTries=1&spellcheck.maxCollations=1&spellcheck.collateExtendedResults=true&qt=/spellCheckCompRH&q=teststop:everother&spellcheck.collateMaxCollectDocs=5
   [junit4]    >        at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:991)
   [junit4]    >        ... 41 more
{noformat}
...that seed did not reproduce on branch_8_0, so I attempted to use "git bisect" to identify exactly when it started to fail, but that led to a false end (651f41e21bd3df98f70d2673295db29506e3d2e6) where a new variable was introduced in RandomCodec, skewing the results. For completeness, I manually tried overriding the random FSTLoadMode in RandomCodec, after consuming the same amount of randomness, and got a failure from this seed for all possible values, supporting the theory that this bug existed prior to this commit and simply doesn't manifest with the same seed before it.
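
The seed drift described above is easy to see in miniature. The sketch below uses Python's `random.Random` purely as a stand-in for the test framework's randomness (it is not the randomizedtesting API, and `later_choices` and the draw counts are hypothetical): one extra draw taken up front, as when a new randomized setting is added to a codec, changes the values derived afterwards from the same seed.

```python
import random

# The failing seed from the report, reused here only as a number.
SEED = 0xAFA731DEE618DA14


def later_choices(seed, extra_draws):
    """Return three 'test parameters' drawn after `extra_draws` unrelated draws."""
    rng = random.Random(seed)
    for _ in range(extra_draws):
        rng.random()  # stands in for a newly added randomized setting
    return [rng.randint(0, 1000) for _ in range(3)]


before = later_choices(SEED, extra_draws=0)  # code before the new variable existed
after = later_choices(SEED, extra_draws=1)   # same seed, one extra draw up front

print("same seed, no extra draw: ", before)
print("same seed, one extra draw:", after)
```

This is why a bisect keyed on a single seed can stop at a commit that merely adds a random draw: the seed no longer selects the same test configuration on either side of it.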

Going one commit prior to when the randomization changed (80a6590c52b), I then attempted to beast this test with different seeds to try and find a new permutation that would fail...
{noformat}
ant beast -Dbeast.iters=1000  -Dtestcase=SpellCheckCollatorTest -Dtests.method=testEstimatedHitCounts -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.badapples=true  -Dtests.locale=et-EE -Dtests.timezone=Australia/Canberra -Dtests.asserts=true -Dtests.file.encoding=UTF-8
{noformat}
...and found this...
{noformat}
  [beaster]   2> 5212 INFO  (TEST-SpellCheckCollatorTest.testEstimatedHitCounts-seed#[989A055D5EA05EC3]) [    ] o.a.s.SolrTestCaseJ4 ###Ending testEstimatedHitCounts
  [beaster]   2> NOTE: reproduce with: ant test  -Dtestcase=SpellCheckCollatorTest -Dtests.method=testEstimatedHitCounts -Dtests.seed=989A055D5EA05EC3 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=et-EE -Dtests.timezone=Australia/Canberra -Dtests.asserts=true -Dtests.file.encoding=UTF-8
  [beaster] [14:21:03.465] ERROR   0.40s | SpellCheckCollatorTest.testEstimatedHitCounts <<<
  [beaster]    > Throwable #1: java.lang.RuntimeException: Exception during query
  [beaster]    > 	at __randomizedtesting.SeedInfo.seed([989A055D5EA05EC3:A921BB68FB9F4E13]:0)
  [beaster]    > 	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:934)
  [beaster]    > 	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:894)
  [beaster]    > 	at org.apache.solr.spelling.SpellCheckCollatorTest.testEstimatedHitCounts(SpellCheckCollatorTest.java:569)
...
  [beaster]    > Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name='spellcheck']/lst[@name='collations']/lst[@name='collation']/long[@name='hits' and 3 <= . and . <= 13]
  [beaster]    > 	xml response was: <?xml version="1.0" encoding="UTF-8"?>
  [beaster]    > <response>
  [beaster]    > <lst name="responseHeader"><int name="status">0</int><int name="QTime">3</int></lst><result name="response" numFound="0" start="0"></result><lst name="spellcheck"><lst name="suggestions"><lst name="everother"><int name="numFound">1</int><int name="startOffset">9</int><int name="endOffset">18</int><arr name="suggestion"><str>everyother</str></arr></lst></lst><lst name="collations"><lst name="collation"><str name="collationQuery">teststop:everyother</str><long name="hits">14</long><lst name="misspellingsAndCorrections"><str name="everother">everyother</str></lst></lst></lst></lst>
  [beaster]    > </response>
  [beaster]    > 
  [beaster]    > 	request was:spellcheck=true&spellcheck.dictionary=direct&spellcheck.count=1&spellcheck.collate=true&spellcheck.maxCollationTries=1&spellcheck.maxCollations=1&spellcheck.collateExtendedResults=true&qt=/spellCheckCompRH&q=teststop:everother&spellcheck.collateMaxCollectDocs=5
  [beaster]    > 	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:927)
  [beaster]    > 	... 41 more
{noformat}
A quick check confirmed that this seed reproduced even as far back as branch_8_0 (93d1e67886e75623f3f72526284bdbc3b1fef7e6), but did *NOT* reproduce when this assert was last (substantively) modified (213a2a1791e4557afd2542a25e94eec65e29a42d) as part of SOLR-5344 (in 2016).
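
Both failures trip the same assertion: the XPath requires the collation's estimated `hits` to fall between 3 and 13, and the response reports 14. A minimal Python sketch of that check, using a trimmed copy of the response above (`hits_in_range` is a hypothetical helper, not part of SolrTestCaseJ4, and because ElementTree only supports a subset of XPath the range comparison is done in Python rather than in the expression itself):

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the failing response: only the collation element is kept.
RESPONSE = """<response>
<lst name="spellcheck"><lst name="collations"><lst name="collation">
<str name="collationQuery">teststop:everyother</str>
<long name="hits">14</long>
</lst></lst></lst>
</response>"""

HITS_PATH = (
    ".//lst[@name='spellcheck']/lst[@name='collations']"
    "/lst[@name='collation']/long[@name='hits']"
)


def hits_in_range(xml_text, low=3, high=13):
    """Return (hits, ok), where ok mirrors the test's `3 <= . and . <= 13`."""
    node = ET.fromstring(xml_text).find(HITS_PATH)
    if node is None:
        return None, False
    hits = int(node.text)
    return hits, low <= hits <= high


hits, ok = hits_in_range(RESPONSE)
print("hits:", hits, "within expected range:", ok)
```

The estimate is off by one from the upper bound the test allows, which is consistent with a small change in how hits are counted rather than a wholesale breakage.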

So back to git bisect...
{noformat}
93d1e67886e75623f3f72526284bdbc3b1fef7e6 is the first bad commit
commit 93d1e67886e75623f3f72526284bdbc3b1fef7e6
Author: jimczi <ji...@apache.org>
Date:   Mon Jan 28 22:43:20 2019 +0100

    LUCENE-8660: TopDocsCollectors now return an accurate count (instead of a lower bound) if the total hit count is equal to the provided threshold.

:040000 040000 7bd5d8ab9e691e1253979cf8698fa88b2c7b9fb1 ec6791f9fc718f27eb7ffd9451bff3595e25250b M	lucene
bisect run success
{noformat}
...I'm not really clear on how this change could affect this test given that AFAIK every usage of TopDocsCollectors in Solr _should_ be insisting on an exact count of total hits (ie: threshold=MAX_INT), but since this commit has no modifications to the test-framework randomness that could otherwise impact this test, the fact that it's the first commit where this seed fails means *something* about these changes impacted the underlying code being tested.
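
For anyone repeating the bisect, the per-commit check can be scripted. The following is only a sketch of a helper for `git bisect run`: the script name `check_seed.py`, its functions, and the optional working-directory argument are hypothetical, while the `ant` flags are copied from the reproduce line above. `git bisect run` treats exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad.

```python
import subprocess
import sys


def ant_command(seed):
    """Build the reproduce command for one seed (flags taken from the report)."""
    return [
        "ant", "test",
        "-Dtestcase=SpellCheckCollatorTest",
        "-Dtests.method=testEstimatedHitCounts",
        "-Dtests.seed=" + seed,
        "-Dtests.multiplier=2",
        "-Dtests.nightly=true",
        "-Dtests.slow=true",
        "-Dtests.badapples=true",
        "-Dtests.locale=et-EE",
        "-Dtests.timezone=Australia/Canberra",
        "-Dtests.asserts=true",
        "-Dtests.file.encoding=UTF-8",
    ]


def bisect_status(returncode):
    """Map ant's exit code to what git bisect run expects: 0 good, 1 bad."""
    return 0 if returncode == 0 else 1


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (assumed): check_seed.py SEED [DIRECTORY_TO_RUN_ANT_IN]
    workdir = sys.argv[2] if len(sys.argv) > 2 else "."
    result = subprocess.run(ant_command(sys.argv[1]), cwd=workdir)
    sys.exit(bisect_status(result.returncode))
```

It would be invoked along the lines of `git bisect run python3 check_seed.py 989A055D5EA05EC3` after marking one good and one bad commit. Note that this sketch also reports a commit that fails to build as bad; a more careful script would probably return 125 in that case so the commit is skipped.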

> SpellCheckCollatorTest.testEstimatedHitCounts reproducing failure seed
> ----------------------------------------------------------------------
>
>                 Key: SOLR-13946
>                 URL: https://issues.apache.org/jira/browse/SOLR-13946
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Priority: Major
>
> The following seed reliably fails on branch_8x. Based on the results of git bisect, it appears that this in some way relates to changes made in LUCENE-8660 (TopDocsCollectors' accuracy in reporting totalHits when dealing with totalHitsThreshold) even though all usage in Solr should be requesting {{totalHitsThreshold=Integer.MAX_VALUE}}
> {noformat}
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=SpellCheckCollatorTest -Dtests.method=testEstimatedHitCounts -Dtests.seed=AFA731DEE618DA14 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ga-IE -Dtests.timezone=Canada/Atlantic -Dtests.asserts=true -Dtests.file.encoding=UTF-8
>    [junit4] ERROR   0.36s | SpellCheckCollatorTest.testEstimatedHitCounts <<<
>    [junit4]    > Throwable #1: java.lang.RuntimeException: Exception during query
>    [junit4]    > 	at __randomizedtesting.SeedInfo.seed([AFA731DEE618DA14:9E1C8FEB4327CAC4]:0)
>    [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:1001)
>    [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:961)
>    [junit4]    > 	at org.apache.solr.spelling.SpellCheckCollatorTest.testEstimatedHitCounts(SpellCheckCollatorTest.java:569)
>    [junit4]    > 	at java.lang.Thread.run(Thread.java:748)
>    [junit4]    > Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name='spellcheck']/lst[@name='collations']/lst[@name='collation']/long[@name='hits' and 3 <= . and . <= 13]
>    [junit4]    > 	xml response was: <?xml version="1.0" encoding="UTF-8"?>
>    [junit4]    > <response>
>    [junit4]    > <lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst><result name="response" numFound="0" start="0"></result><lst name="spellcheck"><lst name="suggestions"><lst name="everother"><int name="numFound">1</int><int name="startOffset">9</int><int name="endOffset">18</int><arr name="suggestion"><str>everyother</str></arr></lst></lst><lst name="collations"><lst name="collation"><str name="collationQuery">teststop:everyother</str><long name="hits">14</long><lst name="misspellingsAndCorrections"><str name="everother">everyother</str></lst></lst></lst></lst>
>    [junit4]    > </response>
>    [junit4]    > 	request was:spellcheck=true&spellcheck.dictionary=direct&spellcheck.count=1&spellcheck.collate=true&spellcheck.maxCollationTries=1&spellcheck.maxCollations=1&spellcheck.collateExtendedResults=true&qt=/spellCheckCompRH&q=teststop:everother&spellcheck.collateMaxCollectDocs=5
>    [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:994)
>    [junit4]    > 	... 41 more
> {noformat}


