Posted to dev@lucene.apache.org by "Steve Rowe (JIRA)" <ji...@apache.org> on 2018/05/24 15:15:00 UTC

[jira] [Commented] (LUCENE-7805) TestRandomChains.testRandomChainsWithLargeStrings() failures

    [ https://issues.apache.org/jira/browse/LUCENE-7805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489188#comment-16489188 ] 

Steve Rowe commented on LUCENE-7805:
------------------------------------

Another reproducing failure from [https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4654/]:

{noformat}
   [junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains
   [junit4]   2> TEST FAIL: useCharFilter=false text='\u0719\u4131\ud44b\u6c69\u0640 \u9930\udbd0\udd14\u1bc7  uwlq buso \u169a\u168c\u169d\u1697 \ud800\udd23 cJI \u17af\u17eb \ud800\udfad \u03ad\uf560z z b \uacad\nX   lnhsmof ypvaz dbh hihgk'
   [junit4]   2> Exception from random analyzer: 
   [junit4]   2> charfilters=
   [junit4]   2> tokenizer=
   [junit4]   2>   org.apache.lucene.analysis.standard.ClassicTokenizer()
   [junit4]   2> filters=
   [junit4]   2>   org.apache.lucene.analysis.MockSynonymFilter(ValidatingTokenFilter@2a0aa2a5 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1)ConditionalTokenFilter: 
   [junit4]   2>   org.apache.lucene.analysis.shingle.ShingleFilter(OneTimeWrapper@7eb4cb78 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1, 8)
   [junit4]   2>   org.apache.lucene.analysis.ar.ArabicStemFilter(ValidatingTokenFilter@2853f327 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1,keyword=false)ConditionalTokenFilter: 
   [junit4]   2>   org.apache.lucene.analysis.shingle.ShingleFilter(OneTimeWrapper@32562616 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1,keyword=false, vesj)
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestRandomChains -Dtests.method=testRandomChainsWithLargeStrings -Dtests.seed=4B2538DE118B4F1E -Dtests.slow=true -Dtests.locale=mk-MK -Dtests.timezone=Indian/Comoro -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.14s J1 | TestRandomChains.testRandomChainsWithLargeStrings <<<
   [junit4]    > Throwable #1: java.lang.IllegalStateException: last stage: inconsistent endOffset at pos=21: 67 vs 71; token=lnhsmof ypvaz dbh
   [junit4]    > 	at __randomizedtesting.SeedInfo.seed([4B2538DE118B4F1E:217E87CF48C56FED]:0)
   [junit4]    > 	at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:122)
   [junit4]    > 	at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:746)
   [junit4]    > 	at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:657)
   [junit4]    > 	at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:559)
   [junit4]    > 	at org.apache.lucene.analysis.core.TestRandomChains.testRandomChainsWithLargeStrings(TestRandomChains.java:882)
   [junit4]    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]    > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]    > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
   [junit4]    > 	at java.base/java.lang.Thread.run(Thread.java:844)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): {dummy=BlockTreeOrds(blocksize=128)}, docValues:{}, maxPointsInLeafNode=1468, maxMBSortInHeap=6.910135125784358, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@6fc32a30), locale=mk-MK, timezone=Indian/Comoro
   [junit4]   2> NOTE: Mac OS X 10.11.6 x86_64/Oracle Corporation 9 (64-bit)/cpus=3,threads=1,free=128110608,total=218628096
{noformat}
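
For anyone reading the "inconsistent endOffset" message above without the test framework at hand, here is a rough stand-alone sketch of the kind of check that produces it. It is not the ValidatingTokenFilter source. The class name, the accept() signature, and the sample tokens are all hypothetical, and it assumes the check simply remembers the first offset seen at each graph position and rejects any later token that disagrees.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-alone sketch; NOT Lucene's ValidatingTokenFilter. */
public class OffsetConsistencySketch {
    // First startOffset seen for tokens leaving a position,
    // and first endOffset seen for tokens arriving at a position.
    private final Map<Integer, Integer> posToStartOffset = new HashMap<>();
    private final Map<Integer, Integer> posToEndOffset = new HashMap<>();
    private int pos = -1;

    /** Records one token; throws if its offsets contradict an earlier token. */
    public void accept(String term, int posInc, int posLen, int startOffset, int endOffset) {
        pos += posInc;              // position this token starts at
        int endPos = pos + posLen;  // position this token ends at

        Integer seenStart = posToStartOffset.putIfAbsent(pos, startOffset);
        if (seenStart != null && seenStart != startOffset) {
            throw new IllegalStateException("inconsistent startOffset at pos=" + pos
                + ": " + seenStart + " vs " + startOffset + "; token=" + term);
        }

        Integer seenEnd = posToEndOffset.putIfAbsent(endPos, endOffset);
        if (seenEnd != null && seenEnd != endOffset) {
            throw new IllegalStateException("inconsistent endOffset at pos=" + endPos
                + ": " + seenEnd + " vs " + endOffset + "; token=" + term);
        }
    }

    public static void main(String[] args) {
        OffsetConsistencySketch check = new OffsetConsistencySketch();
        // Two made-up tokens that start and end at the same positions
        // but disagree about where the text ends.
        check.accept("x", 1, 2, 50, 67);
        try {
            check.accept("y", 0, 2, 50, 71);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the second token arrives at the same end position as the first but reports a different endOffset, which is the general shape of the "67 vs 71" complaint in the log. Which filter in the chain actually emits the conflicting offsets is a separate question that the sketch does not answer.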

> TestRandomChains.testRandomChainsWithLargeStrings() failures
> ------------------------------------------------------------
>
>                 Key: LUCENE-7805
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7805
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Steve Rowe
>            Priority: Major
>
> My Jenkins found a reproducing seed on master; FlattenGraphFilter looks like where the problem happens:
> {noformat}
> Checking out Revision 680f4d7fd378868254786107de92a894758f667c (refs/remotes/origin/master)
> [...]
>    [junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains
>    [junit4]   2> TEST FAIL: useCharFilter=false text='\u0003J\u522f  nwqbl  uwtps  ob zdyokom ){0'
>    [junit4]   2> Exception from random analyzer: 
>    [junit4]   2> charfilters=
>    [junit4]   2>   org.apache.lucene.analysis.charfilter.HTMLStripCharFilter(java.io.StringReader@3ab617ae, [])
>    [junit4]   2>   org.apache.lucene.analysis.charfilter.HTMLStripCharFilter(org.apache.lucene.analysis.charfilter.HTMLStripCharFilter@23e3c717)
>    [junit4]   2> tokenizer=
>    [junit4]   2>   org.apache.lucene.analysis.ngram.NGramTokenizer(9, 43)
>    [junit4]   2> filters=
>    [junit4]   2>   org.apache.lucene.analysis.miscellaneous.CodepointCountFilter(ValidatingTokenFilter@6b4708ea term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word, 33, 44)
>    [junit4]   2>   org.apache.lucene.analysis.shingle.ShingleFilter(ValidatingTokenFilter@5533fb25 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word, <EMAIL>)
>    [junit4]   2>   org.apache.lucene.analysis.core.FlattenGraphFilter(ValidatingTokenFilter@4ef4c44 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word)
>    [junit4]   2>   org.apache.lucene.analysis.miscellaneous.KeepWordFilter(ValidatingTokenFilter@15baa1c7 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word, [akbnucwt, vrkwm, jtomhk, jxgmfalr])
>    [junit4]   2> offsetsAreCorrect=true
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestRandomChains -Dtests.method=testRandomChainsWithLargeStrings -Dtests.seed=E9460213902F2F82 -Dtests.slow=true -Dtests.locale=fi-FI -Dtests.timezone=Europe/Malta -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
>    [junit4] FAILURE 0.19s J7 | TestRandomChains.testRandomChainsWithLargeStrings <<<
>    [junit4]    > Throwable #1: java.lang.AssertionError: outputEndNode=3 vs inputTo=2
>    [junit4]    > 	at __randomizedtesting.SeedInfo.seed([E9460213902F2F82:831DBD02C9610F71]:0)
>    [junit4]    > 	at org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:335)
>    [junit4]    > 	at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:67)
>    [junit4]    > 	at org.apache.lucene.analysis.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:51)
>    [junit4]    > 	at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:67)
>    [junit4]    > 	at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:731)
>    [junit4]    > 	at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:642)
>    [junit4]    > 	at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:540)
>    [junit4]    > 	at org.apache.lucene.analysis.core.TestRandomChains.testRandomChainsWithLargeStrings(TestRandomChains.java:880)
>    [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
>    [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): {dummy=PostingsFormat(name=LuceneVarGapFixedInterval)}, docValues:{}, maxPointsInLeafNode=807, maxMBSortInHeap=5.007333045299232, sim=RandomSimilarity(queryNorm=true): {}, locale=fi-FI, timezone=Europe/Malta
>    [junit4]   2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 1.8.0_77 (64-bit)/cpus=16,threads=1,free=492062472,total=525336576
> {noformat}


