Posted to dev@lucene.apache.org by "Robert Muir (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2011/11/12 01:22:51 UTC

[jira] [Issue Comment Edited] (LUCENE-3533) Nuke SpanFilters and CachingSpanFilter (maybe move to sandbox)

    [ https://issues.apache.org/jira/browse/LUCENE-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148888#comment-13148888 ] 

Robert Muir edited comment on LUCENE-3533 at 11/12/11 12:22 AM:
----------------------------------------------------------------

It would be good to get a review on the patch: I think it's OK in general.

It removes a lot of stupidity from the spans, except for one case:

SpanMultiTermQueryWrapper is still not single-pass (it simply throws all TermContexts away).

I thought about how to solve that one too, and I'm convinced it's unfixable,
because SpanQueries aren't really query trees: it's just one query that
calls extractTerms on everything underneath it.

For this reason, even if I made this MTQ one single-pass by allowing TermContexts
to be passed to e.g. SpanOrQuery, it would work for that case, but if you had that
query inside another SpanQuery then it would still do the extra seek like it does now.
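
As a rough illustration of the extra seek, here is a toy model. It is not Lucene code: ToyTermDict, expandPrefix, seekExact and ToySpanDemo are hypothetical names made up for this sketch, and a "seek" is just a counter. It only assumes what is described above: the multi-term rewrite visits each matching term once and keeps nothing but the term text, then the outermost span query collects all terms underneath it and looks each one up again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: these classes are hypothetical and are NOT Lucene's API.
class ToyTermDict {
    private final TreeMap<String, Integer> docFreqs = new TreeMap<>();
    int seeks = 0; // counts term dictionary lookups

    void add(String term, int docFreq) {
        docFreqs.put(term, docFreq);
    }

    // Enumerating a prefix positions on each matching term once
    // (counted as one seek per term), but returns only the term text.
    List<String> expandPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        for (String term : docFreqs.tailMap(prefix).keySet()) {
            if (!term.startsWith(prefix)) {
                break;
            }
            seeks++;
            out.add(term);
        }
        return out;
    }

    // Looking a term up again by its text costs another seek.
    int seekExact(String term) {
        seeks++;
        return docFreqs.getOrDefault(term, 0);
    }
}

class ToySpanDemo {
    // Pass 1: the multi-term rewrite enumerates terms and discards the
    // per-term state it had in hand (the "term context").
    // Pass 2: the outermost span query gathers every term underneath it
    // and seeks each one again to rebuild that state.
    static int seeksForPrefix(ToyTermDict dict, String prefix) {
        List<String> rewritten = dict.expandPrefix(prefix); // pass 1
        for (String term : rewritten) {                     // pass 2
            dict.seekExact(term);
        }
        return dict.seeks;
    }

    public static void main(String[] args) {
        ToyTermDict dict = new ToyTermDict();
        dict.add("span", 3);
        dict.add("spanner", 1);
        dict.add("spans", 2);
        dict.add("term", 5);
        // Three terms match "span", and each is visited twice.
        System.out.println(seeksForPrefix(dict, "span")); // prints 6
    }
}
```

In this model, handing the state from pass 1 directly to the enclosing query would remove the second loop, but only if that enclosing query is the outermost one; any further SpanQuery wrapped around it would run its own pass 2.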

But still, with the patch spans are a little better.
                
      was (Author: rcmuir):
    It would be good to get a review on the patch: I think it's OK in general.

It removes a lot of stupidity from the spans, except for one case:

SpanMultiTermQueryWrapper is still not single-pass (it simply throws all TermContexts away).

I thought about how to solve that one too, and I'm convinced it's unfixable,
because SpanQueries aren't really query trees: it's just one query that
calls rewriteTerms on everything underneath it.

For this reason, even if I made this MTQ one single-pass by allowing TermContexts
to be passed to e.g. SpanOrQuery, it would work for that case, but if you had that
query inside another SpanQuery then it would still do the extra seek like it does now.

But still, with the patch spans are a little better.
                  
> Nuke SpanFilters and CachingSpanFilter (maybe move to sandbox)
> --------------------------------------------------------------
>
>                 Key: LUCENE-3533
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3533
>             Project: Lucene - Java
>          Issue Type: Task
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>         Attachments: LUCENE-3533.patch
>
>
> SpanFilters are inefficient and OOM easily (they don't scale at all: they create large Lists of Objects for every match, and filtering deleted docs is a pain). Based on some talks with Grant at Eurocon, and the fact that caching of them is still broken in 3.x (but fixed on trunk), I assume nobody uses them, so let's nuke them. They are also in the wrong package, so standard statement: "Die, SpanFilters, die!"
