Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2018/06/01 17:46:00 UTC

[jira] [Commented] (LUCENE-8332) New ConcatenateGraphTokenStream (move/rename CompletionTokenStream)

    [ https://issues.apache.org/jira/browse/LUCENE-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498323#comment-16498323 ] 

David Smiley commented on LUCENE-8332:
--------------------------------------

Here is the final patch from the GitHub PR.  Thanks [~jimczi] for reviewing. I did make one change:  I removed the bug fix to TokenStreamToAutomaton, as it deserves its own issue (LUCENE-8344).  I commented out the trailing stopword in org.apache.lucene.analysis.miscellaneous.TestConcatenateGraphFilter#testWithStopword so this test wouldn't fail, leaving a reference to the other JIRA issue.

It's a bit of a shame that the Git history on ConcatenateGraphFilter may be less than ideal, since CompletionTokenStream (CTS) will stay in a gutted form that delegates to it.  I could commit the new CTS in a second commit (the first commit on its own would be broken) and push the two together to the ASF Git servers.  It's probably not worth the hassle, though.

 

> New ConcatenateGraphTokenStream (move/rename CompletionTokenStream)
> -------------------------------------------------------------------
>
>                 Key: LUCENE-8332
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8332
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: LUCENE-8332.patch
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Let's move the CompletionTokenStream from the suggest module into the analysis module, renaming it ConcatenateGraphTokenStream. See comments in LUCENE-8323 leading to this idea. Such a TokenStream (or TokenFilter?) has several uses:
>  * for the suggest module
>  * by the SolrTextTagger for NER/ERD use cases – SOLR-12376
>  * for doing complete match search efficiently
> It will need a factory – a TokenFilterFactory, even though we don't have a TokenFilter based subclass of TokenStream.
> There appears to be no back-compat concern in it suddenly disappearing from the suggest module: it's marked experimental, and it seems to be public now perhaps only due to some technicality (it has package-level constructors).
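
To illustrate the "complete match search" use case from the list above, here is a minimal stdlib-only sketch. It is not the Lucene API: the class name ConcatSketch, its analyze and concatenate helpers, and the separator character are all assumptions made for this example. It shows only the core idea of joining a field's tokens into one term so that a whole-value match becomes a single lookup. It ignores token graphs (e.g. synonyms), which the real stream would also need to handle.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class ConcatSketch {
    // Hypothetical separator, chosen for this sketch only: a control
    // character that is unlikely to occur inside a token.
    static final char SEP = '\u001F';

    // Stand-in for an analyzer: lowercase, then split on whitespace.
    static List<String> analyze(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Join all tokens of one value into a single term.
    static String concatenate(List<String> tokens) {
        return String.join(String.valueOf(SEP), tokens);
    }

    public static void main(String[] args) {
        // "Index" two field values as one concatenated term each.
        Set<String> index = new HashSet<>();
        index.add(concatenate(analyze("New York")));
        index.add(concatenate(analyze("New York City")));

        // A complete match is now a single set lookup.
        System.out.println(index.contains(concatenate(analyze("new  york")))); // true
        // A partial value does not match any whole-value term.
        System.out.println(index.contains(concatenate(analyze("york"))));      // false
    }
}
```

The trade-off in this sketch is that only whole-value matches are possible against the concatenated term; per-token matching would still need the ordinary tokenized field.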


