Posted to dev@lucene.apache.org by "Alan Woodward (JIRA)" <ji...@apache.org> on 2018/06/18 13:14:00 UTC

[jira] [Commented] (LUCENE-8361) Make TestRandomChains check that filters preserve positions

    [ https://issues.apache.org/jira/browse/LUCENE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515679#comment-16515679 ] 

Alan Woodward commented on LUCENE-8361:
---------------------------------------

Here's a patch that adds a step to {{BaseTokenStreamTestCase.checkAnalysisConsistency}}.

I've already found a failure in MinHashTokenFilter, which raises the question of whether we still expect end() to report the total number of original tokens for filters that summarise the entire stream. The same question applies to FingerprintFilter and ConcatenateGraphFilter. I'm minded to just exclude filter chains that contain any of these from the test.
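
To make the invariant concrete, here is a minimal sketch of the kind of check under discussion. It deliberately does not use the Lucene API: {{Tok}}, {{Stream}}, {{removeStopwords}} and {{summarise}} are hypothetical names for a simplified model in which a consumed stream is a list of (term, position increment) pairs plus a trailing increment reported at end(). It is not the code in the attached patch, and {{summarise}} is only an illustration of a filter that collapses the stream, not a description of how MinHashTokenFilter actually behaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of a token stream; NOT the Lucene API. */
final class PositionSumSketch {

    /** One token: its term text and its position increment. */
    static final class Tok {
        final String term;
        final int posInc;
        Tok(String term, int posInc) { this.term = term; this.posInc = posInc; }
    }

    /** A fully consumed stream: its tokens plus the increment reported at end(). */
    static final class Stream {
        final List<Tok> tokens;
        final int endPosInc;
        Stream(List<Tok> tokens, int endPosInc) { this.tokens = tokens; this.endPosInc = endPosInc; }
    }

    /** Sum of all position increments, including the trailing one from end(). */
    static int totalPositions(Stream s) {
        int sum = s.endPosInc;
        for (Tok t : s.tokens) {
            sum += t.posInc;
        }
        return sum;
    }

    /** Removing filter that carries skipped increments to the next kept token, or to end(). */
    static Stream removeStopwords(Stream in, Set<String> stopwords) {
        List<Tok> out = new ArrayList<>();
        int skipped = 0;
        for (Tok t : in.tokens) {
            if (stopwords.contains(t.term)) {
                skipped += t.posInc;          // remember the gap left by the dropped token
            } else {
                out.add(new Tok(t.term, t.posInc + skipped));
                skipped = 0;
            }
        }
        return new Stream(out, in.endPosInc + skipped);
    }

    /** Illustrative summarising filter: emits one token and discards the input positions. */
    static Stream summarise(Stream in) {
        StringBuilder sb = new StringBuilder();
        for (Tok t : in.tokens) {
            sb.append(t.term);
        }
        List<Tok> out = new ArrayList<>();
        out.add(new Tok(sb.toString(), 1));
        return new Stream(out, 0);
    }

    /** The proposed check: the total number of positions is the same before and after. */
    static boolean preservesPositions(Stream before, Stream after) {
        return totalPositions(before) == totalPositions(after);
    }
}
```

In this model a filter that removes tokens passes the check as long as it forwards the skipped increments, while a filter that summarises the whole stream fails it unless it also reports the original total at end(), which is the open question above.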

> Make TestRandomChains check that filters preserve positions
> -----------------------------------------------------------
>
>                 Key: LUCENE-8361
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8361
>             Project: Lucene - Core
>          Issue Type: Test
>            Reporter: Adrien Grand
>            Assignee: Alan Woodward
>            Priority: Minor
>         Attachments: LUCENE-8361.patch
>
>
> Follow-up to LUCENE-8360: it is a bit disappointing that we only found this issue because of a newly introduced token filter. I wonder whether we could make TestRandomChains detect more bugs by verifying that the sum of position increments is preserved through the whole analysis chain.


