Posted to dev@lucene.apache.org by "Mike Sokolov (JIRA)" <ji...@apache.org> on 2018/04/25 14:42:00 UTC

[jira] [Comment Edited] (LUCENE-8273) Add a BypassingTokenFilter

    [ https://issues.apache.org/jira/browse/LUCENE-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452348#comment-16452348 ] 

Mike Sokolov edited comment on LUCENE-8273 at 4/25/18 2:41 PM:
---------------------------------------------------------------

Perhaps this can be extended to handle ShingleFilter and similar filters by tracking whether we are currently in a recursion. A filter that consumes multiple input tokens may call us several times while we are in the DELEGATING state. If we remember that we called delegate.incrementToken() and that call has not yet returned, then we should not recurse again, but should instead call input.incrementToken(). I haven't tried this, and my brain is getting contorted trying to hold all the cases, but I think it should work.
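
A minimal sketch of the recursion guard, using a toy pull-based model rather than Lucene's real TokenStream API. Everything here is an assumption made up for illustration: the TokenSource interface, BypassingSource, PairJoiner (a crude stand-in for a multi-token consumer such as ShingleFilter), and the inDelegate/pending fields. It only shows the control flow: while our call into the delegate has not returned, re-entrant calls read from the input instead of recursing.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

final class BypassSketch {

    /** Toy stand-in for a token stream: returns the next token, or null when exhausted. */
    interface TokenSource {
        String next();
    }

    /** Hypothetical multi-token consumer: pulls up to two tokens per call and joins them. */
    static final class PairJoiner implements TokenSource {
        private final TokenSource in;

        PairJoiner(TokenSource in) {
            this.in = in;
        }

        @Override
        public String next() {
            String first = in.next();
            if (first == null) return null;
            String second = in.next();
            return second == null ? first : first + "_" + second;
        }
    }

    /** Hypothetical wrapper: applies the delegate only to tokens matching shouldFilter. */
    static final class BypassingSource implements TokenSource {
        private final TokenSource input;
        private final TokenSource delegate;
        private final Predicate<String> shouldFilter;
        private boolean inDelegate = false; // true while delegate.next() has not yet returned
        private String pending;             // token already read that the delegate must see first

        BypassingSource(TokenSource input,
                        Function<TokenSource, TokenSource> wrap,
                        Predicate<String> shouldFilter) {
            this.input = input;
            this.shouldFilter = shouldFilter;
            this.delegate = wrap.apply(this); // the delegate pulls its tokens from us
        }

        @Override
        public String next() {
            if (inDelegate) {
                // Re-entrant call from the delegate: do not recurse into it again.
                if (pending != null) {
                    String t = pending;
                    pending = null;
                    return t;
                }
                return input.next();
            }
            String token = input.next();
            if (token == null || !shouldFilter.test(token)) {
                return token; // bypass the delegate entirely
            }
            pending = token;
            inDelegate = true;
            try {
                return delegate.next();
            } finally {
                inDelegate = false;
            }
        }
    }

    /** Runs the wrapper over a small fixed input and collects the emitted tokens. */
    static List<String> demo() {
        Iterator<String> it = List.of("a-b", "c", "d", "e-f", "g").iterator();
        TokenSource input = () -> it.hasNext() ? it.next() : null;
        TokenSource stream = new BypassingSource(input, PairJoiner::new, t -> t.contains("-"));
        List<String> out = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With the hyphen predicate above, demo() yields [a-b_c, d, e-f_g]: the delegate is entered once per hyphenated token, and its second pull goes straight to the input. Note that the delegate also swallows the unhyphenated token that follows, and a delegate that buffers state across calls would need more care than this toy handles.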


was (Author: sokolov):
Perhaps this can be extended to handle the case of ShingleFilter et al by tracking whether we are currently in a recursion. In the case of some filter that consumes multiple input tokens, it could call us multiple times while we are in DELEGATING, but if we remember that we called delegate.incrementToken() and it has not yet returned, then we should not recurse again, but should instead call input.incrementToken().


> Add a BypassingTokenFilter
> --------------------------
>
>                 Key: LUCENE-8273
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8273
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8273.patch, LUCENE-8273.patch
>
>
> Spinoff of LUCENE-8265.  It would be useful to be able to wrap a TokenFilter in such a way that it could optionally be bypassed based on the current state of the TokenStream.  This could be used to, for example, only apply WordDelimiterFilter to terms that contain hyphens.



