Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2013/08/20 05:50:52 UTC

[jira] [Commented] (LUCENE-3849) position increments should be implemented by TokenStream.end()

    [ https://issues.apache.org/jira/browse/LUCENE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744671#comment-13744671 ] 

Robert Muir commented on LUCENE-3849:
-------------------------------------

Thanks for bringing this back to life Mike.

How did you deal with facets? Is that concern out of date now that faceting no longer encodes with payloads? It was the big barrier to my previous patch: lots of tokenstreams never call clearAttributes() at all (LUCENE-4318).

If it's no longer a problem, let's mark that issue resolved.

The patch looks good to me, though it would be good to temporarily make end() final (or something similar) in TokenStream.java and review every tokenstream that has its own impl, to make sure each one is OK.

I still have my reservations about the posInc=0 stuff (I wish this was cleaner), but I don't see a better way, and this is really a bug we should fix.
                
> position increments should be implemented by TokenStream.end()
> --------------------------------------------------------------
>
>                 Key: LUCENE-3849
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3849
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 3.6, 4.0-ALPHA
>            Reporter: Robert Muir
>         Attachments: LUCENE-3849.patch, LUCENE-3849.patch, LUCENE-3849.patch, LUCENE-3849.patch
>
>
> If you have pages of a book as multivalued fields, with the default position increment gap
> of Analyzer.java (0), phrase queries won't work across pages if one ends with stopword(s).
> This is because the 'trailing holes' are not taken into account in end(). So I think in
> TokenStream.end(), subclasses of FilteringTokenFilter (e.g. StopFilter) should do:
> {code}
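> super.end();
> // add the positions this filter skipped after the last emitted token
> posIncAtt.setPositionIncrement(posIncAtt.getPositionIncrement() + skippedPositions);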
> {code}
> One problem is that these filters need to 'add' to the posinc, but currently nothing clears
> the attributes for end() [they are dirty, except offset which is set by the tokenizer].
> Also the indexer should be changed to pull posIncAtt from end().
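
To make the 'trailing holes' problem from the description concrete, here is a rough toy model in plain Java. It is not Lucene code and does not use the Lucene API: TrailingHolesDemo, analyze(), index() and the honorEnd flag are all hypothetical names invented for this sketch, and the whitespace tokenizer and stopword set are simplifying assumptions. It only shows how absolute positions drift when the indexer ignores the position increment that end() would report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of position bookkeeping. Hypothetical names, not the Lucene API. */
final class TrailingHolesDemo {

    /** Result of analyzing one value of a multivalued field. */
    static final class Analyzed {
        final List<String> terms = new ArrayList<>();
        final List<Integer> posIncs = new ArrayList<>(); // increment of each kept term
        int endPosInc = 0; // what end() would report: holes after the last kept term
    }

    /** Whitespace tokenization plus stopword removal, counting skipped positions. */
    static Analyzed analyze(String value, Set<String> stopwords) {
        Analyzed out = new Analyzed();
        int skipped = 0;
        for (String tok : value.trim().split("\\s+")) {
            if (tok.isEmpty()) {
                continue; // empty input yields one empty token; ignore it
            }
            if (stopwords.contains(tok)) {
                skipped++; // a hole: the position is consumed but no term is emitted
            } else {
                out.terms.add(tok);
                out.posIncs.add(1 + skipped);
                skipped = 0;
            }
        }
        out.endPosInc = skipped; // trailing holes, only visible at end()
        return out;
    }

    /** Absolute positions of all kept terms across the values of one field. */
    static List<Integer> index(List<String> values, Set<String> stopwords,
                               int gap, boolean honorEnd) {
        List<Integer> positions = new ArrayList<>();
        int position = -1; // the first term with increment 1 lands on position 0
        for (String value : values) {
            Analyzed a = analyze(value, stopwords);
            for (int inc : a.posIncs) {
                position += inc;
                positions.add(position);
            }
            if (honorEnd) {
                position += a.endPosInc; // the step the issue asks the indexer to take
            }
            position += gap; // position increment gap between values
        }
        return positions;
    }

    public static void main(String[] args) {
        Set<String> stop = Set.of("the");
        List<String> pages = List.of("fox jumped over the", "lazy dog");
        System.out.println(index(pages, stop, 0, false)); // [0, 1, 2, 3, 4]
        System.out.println(index(pages, stop, 0, true));  // [0, 1, 2, 4, 5]
    }
}
```

In the first run "lazy" lands directly after "over", so a phrase query for "over the lazy" (which expects a one-position hole between the two terms) would not line up across the page boundary. In the second run the hole is preserved.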
