Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2017/10/19 20:58:00 UTC
[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

     [ https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-7997:
--------------------------------
    Attachment: LUCENE-7997_wip.patch

Hacky patch with my current state. I spent a lot of time looking for a reasonable state space, which is really hard since we have no limit on the number of documents, no bounds on boosts, etc. I tried really hard (maybe too hard) to be super-fair to the similarity, e.g. the test shouldn't generate scenarios that are impossible to create with IndexWriter. But some things (like huge tf values combined with tiny norm values) are fair game because we don't limit stacking terms/synonyms and so forth. This stuff may still have interesting test bugs if beasted enough.

Currently the test fails: it seems our BM25 may "go backwards" for largish term freqs, which looks like a floating point issue to me. I haven't tried to debug that yet; other crabs to chase down first.
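
To illustrate why floating point is a plausible suspect, here is a minimal standalone sketch (not part of the patch, and not Lucene's actual scoring code). It evaluates the tfNorm formula exactly as it is printed in the explain() output in this message, using the parameter values from that output. The class and method names are hypothetical. Under those assumptions, the true increase in tfNorm between freq=113658 and freq=113659 is smaller than one float ulp at that magnitude, so single-precision rounding alone can make the two scores tie or flip.

```java
// Hypothetical illustration only: names are invented, and the formula is
// transcribed from the explain() output, not taken from BM25Similarity itself.
final class Bm25TfNormPrecision {
    // Parameter values copied from the explain() output in this message.
    static final double K1 = 1.2;
    static final double B = 0.75;
    static final double AVG_FIELD_LENGTH = 2300.5593;
    static final double FIELD_LENGTH = 1048600.0;

    /** tfNorm as printed by explain(), evaluated in double precision. */
    static double tfNormDouble(double freq) {
        return (freq * (K1 + 1))
            / (freq + K1 * (1 - B + B * FIELD_LENGTH / AVG_FIELD_LENGTH));
    }

    /** Same formula with every operand and intermediate held as a float. */
    static float tfNormFloat(float freq) {
        float k1 = (float) K1;
        float b = (float) B;
        float avg = (float) AVG_FIELD_LENGTH;
        float len = (float) FIELD_LENGTH;
        return (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * len / avg));
    }

    public static void main(String[] args) {
        double prev = tfNormDouble(113658);
        double cur = tfNormDouble(113659);
        // Size of the real step versus the float resolution near that value.
        System.out.println("step in double  = " + (cur - prev));
        System.out.println("float ulp       = " + Math.ulp((float) cur));
        // What the single-precision evaluation produces for both freqs.
        System.out.println("float tfNorm(113658) = " + tfNormFloat(113658f));
        System.out.println("float tfNorm(113659) = " + tfNormFloat(113659f));
    }
}
```

Whether the float version actually goes backwards depends on the exact order of operations, which is why this is only a plausibility check and not a reproduction of the failure.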

I don't think we can really debug anything about this test until we first force explain() to *exactly* match score() for a sim. I realize this is a PITA, but I think we need that and will look into it next.

Here is an example of test output for the "going backwards" case, where it fails the pairwise test but the explanation doesn't show it. I still need to improve this so that it's really easy to write a one-line test method for any scenario, and so on.

{noformat}
[junit4:pickseed] Seed property 'tests.seed' already defined: CA6EF971C3E23AAF
   [junit4] <JUnit4> says ciao! Master seed: CA6EF971C3E23AAF
   [junit4] Executing 1 suite with 1 JVM.
   [junit4] 
   [junit4] Started J0 PID(16127@localhost).
   [junit4] Suite: org.apache.lucene.search.similarities.TestBM25Similarity
   [junit4]   1> 0.03627357 = score(doc=0,freq=113659.0 = prevFreq=113658
   [junit4]   1> ), product of:
   [junit4]   1>   0.016547536 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:
   [junit4]   1>     449.0 = docFreq
   [junit4]   1>     456.0 = docCount
   [junit4]   1>   2.1920826 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
   [junit4]   1>     113659.0 = prevFreq=113658
   [junit4]   1>     1.2 = parameter k1
   [junit4]   1>     0.75 = parameter b
   [junit4]   1>     2300.5593 = avgFieldLength
   [junit4]   1>     1048600.0 = fieldLength
   [junit4]   1> 
   [junit4]   1> 0.03627357 = score(doc=0,freq=113659.0 = currentFreq=113659
   [junit4]   1> ), product of:
   [junit4]   1>   0.016547536 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:
   [junit4]   1>     449.0 = docFreq
   [junit4]   1>     456.0 = docCount
   [junit4]   1>   2.1920826 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
   [junit4]   1>     113659.0 = currentFreq=113659
   [junit4]   1>     1.2 = parameter k1
   [junit4]   1>     0.75 = parameter b
   [junit4]   1>     2300.5593 = avgFieldLength
   [junit4]   1>     1048600.0 = fieldLength
   [junit4]   1> 
   [junit4]   1> BM25(k1=1.2,b=0.75)
   [junit4]   1> field="field",maxDoc=13938,docCount=456,sumTotalTermFreq=1049055,sumDocFreq=456
   [junit4]   1> term="term",docFreq=449,totalTermFreq=196765
   [junit4]   1> norm=168 (doc length ~ 1048600)
   [junit4]   1> freq=113659
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestBM25Similarity -Dtests.method=testRandomScoring -Dtests.seed=CA6EF971C3E23AAF -Dtests.locale=el -Dtests.timezone=Etc/GMT-13 -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 0.12s | TestBM25Similarity.testRandomScoring <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: score(113658)=0.036273565 > score(113659)=0.03627356
   [junit4]    > 	at __randomizedtesting.SeedInfo.seed([CA6EF971C3E23AAF:41F1A0C3D995DCA5]:0)
   [junit4]    > 	at org.apache.lucene.search.similarities.BaseSimilarityTestCase.doTestScoring(BaseSimilarityTestCase.java:324)
   [junit4]    > 	at org.apache.lucene.search.similarities.BaseSimilarityTestCase.testRandomScoring(BaseSimilarityTestCase.java:296)
   [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> NOTE: test params are: codec=CheapBastard, sim=RandomSimilarity(queryNorm=true): {field=DFR I(ne)3(800.0)}, locale=el, timezone=Etc/GMT-13
   [junit4]   2> NOTE: Linux 4.4.0-92-generic amd64/Oracle Corporation 1.8.0_45 (64-bit)/cpus=8,threads=1,free=134724456,total=189267968
   [junit4]   2> NOTE: All tests run in this JVM: [TestBM25Similarity]
   [junit4] Completed [1/1 (1!)] in 1.14s, 1 test, 1 failure <<< FAILURES!
{noformat}

> More sanity testing of similarities
> -----------------------------------
>
>                 Key: LUCENE-7997
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7997
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the similarity is an increasing function of {{freq}} (all other things like DF and length being equal). This sounds like a very reasonable requirement for a similarity, so we should test it in the base similarity test case and maybe move broken similarities to sandbox?


