Posted to dev@lucene.apache.org by "Trejkaz (JIRA)" <ji...@apache.org> on 2011/08/10 08:11:27 UTC

[jira] [Created] (LUCENE-3370) Support for a "SpanNotNearQuery"

Support for a "SpanNotNearQuery"
--------------------------------

                 Key: LUCENE-3370
                 URL: https://issues.apache.org/jira/browse/LUCENE-3370
             Project: Lucene - Java
          Issue Type: New Feature
          Components: core/search
            Reporter: Trejkaz


Sometimes you want to find an instance of a span which does not hit near some other span query.  SpanNotQuery only excludes exact hits on the term, but sometimes you want to exclude hits 1 away from the first, and other times you might want the range to be wider.

So a SpanNotNearQuery could be useful.  

SpanNotQuery is actually very close, and adding slop+inOrder support to it is probably sufficient to make a SpanNotNearQuery. :)

There appears to be one project which has done it in this fashion, although this particular code looks like it's out of date:

http://www.koders.com/java/fid933A84488EBE1F3492B19DE01B2A4FC1D68DA258.aspx?s=ArrayQuery


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3370) Support for a "SpanNotNearQuery"

Posted by "Trejkaz (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163307#comment-13163307 ] 

Trejkaz commented on LUCENE-3370:
---------------------------------

Well, I ran with a modified version of SpanNotQuery for some time and nobody noticed any issues with it, but I just found the one thing which SpanNotQuery does differently from SpanNearQuery which makes it unsuitable for this task.

With a SpanNearQuery, if you have "cat" in the document only once, and you search for span-near("cat","cat"), you will get no hits.  It doesn't regard terms as being "near" themselves.

However with a SpanNotQuery, if you have "cat" in the document only once, and you search for span-not("cat","cat"), you *also* get no hits, because you have subtracted all the spans you got in the first round.

Since SpanNotNearQuery works like an expanded SpanNotQuery, it inherits this behaviour.  As a result, the spans matched by SpanNearQuery and the spans matched by SpanNotNearQuery do not add up to the full set of spans you had before applying the additional query, which is quite confusing to someone who doesn't know how the two queries work.
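
To make the gap concrete, here is a minimal sketch of the behaviour described above. It is a toy model over token positions and does not use Lucene at all; the class SelfMatchSketch and its methods positions, near and notNear are hypothetical names invented for illustration. It only assumes what the comment states: "near" does not treat a term as near itself, while "not near" subtracts every overlapping span, the span itself included.

```java
import java.util.ArrayList;
import java.util.List;

public class SelfMatchSketch {
    // Positions at which `term` occurs in the tokenized document.
    public static List<Integer> positions(String[] doc, String term) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < doc.length; i++) {
            if (doc[i].equals(term)) out.add(i);
        }
        return out;
    }

    // Toy "near": keep a position of `a` only if a *different* position
    // of `b` lies within `dist`. A term is never near itself.
    public static List<Integer> near(List<Integer> a, List<Integer> b, int dist) {
        List<Integer> out = new ArrayList<>();
        for (int p : a) {
            for (int q : b) {
                if (p != q && Math.abs(p - q) <= dist) {
                    out.add(p);
                    break;
                }
            }
        }
        return out;
    }

    // Toy "not near": drop a position of `a` if *any* position of `b`
    // lies within `dist`, including the very same position.
    public static List<Integer> notNear(List<Integer> a, List<Integer> b, int dist) {
        List<Integer> out = new ArrayList<>();
        for (int p : a) {
            boolean excluded = false;
            for (int q : b) {
                if (Math.abs(p - q) <= dist) {
                    excluded = true;
                    break;
                }
            }
            if (!excluded) out.add(p);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] doc = {"the", "cat", "sat"};
        List<Integer> cat = positions(doc, "cat");
        System.out.println("all:      " + cat);
        System.out.println("near:     " + near(cat, cat, 1));
        System.out.println("not-near: " + notNear(cat, cat, 1));
    }
}
```

In this model the single "cat" at position 1 shows up in neither the near result nor the not-near result, so the two results together do not cover the original span.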

                


[jira] [Commented] (LUCENE-3370) Support for a "SpanNotNearQuery"

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082606#comment-13082606 ] 

Hoss Man commented on LUCENE-3370:
----------------------------------

> SpanNotQuery is actually very close, and adding slop+inOrder support to it is probably sufficient to make a SpanNotNearQuery.

A more general solution would probably be something like a "SpanPaddingQuery(final SpanQuery inner, final int startPad, final int endPad)" ... where the Spans produced by an instance would be all of the Spans of the inner query, wrapped so that their start/end were decremented/incremented by the startPad/endPad values.

That should be fairly trivial to implement, and would then let you implement the logic you are talking about using something like "new SpanNotQuery(a, new SpanPaddingQuery(b, slop, slop))"
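
A rough sketch of the padding idea, modelled on plain position ranges rather than on Lucene's actual Spans API. Everything here is hypothetical and for illustration only: the Span class, pad and spanNot are invented names, spans are assumed to be half-open ranges [start, end), and clamping the padded start at 0 is an extra assumption not stated above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpanPaddingSketch {
    // A half-open position range [start, end); a stand-in, not Lucene's Spans.
    public static final class Span {
        public final int start;
        public final int end;

        public Span(int start, int end) {
            this.start = start;
            this.end = end;
        }

        public boolean overlaps(Span other) {
            return start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    // Widen each inner span: start is decremented by startPad (clamped
    // at 0, an assumption) and end is incremented by endPad.
    public static List<Span> pad(List<Span> inner, int startPad, int endPad) {
        List<Span> out = new ArrayList<>();
        for (Span s : inner) {
            out.add(new Span(Math.max(0, s.start - startPad), s.end + endPad));
        }
        return out;
    }

    // Toy span-not: keep the include spans that overlap no exclude span.
    public static List<Span> spanNot(List<Span> include, List<Span> exclude) {
        List<Span> out = new ArrayList<>();
        for (Span in : include) {
            boolean hit = false;
            for (Span ex : exclude) {
                if (in.overlaps(ex)) {
                    hit = true;
                    break;
                }
            }
            if (!hit) out.add(in);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> a = Arrays.asList(new Span(2, 3), new Span(10, 11));
        List<Span> b = Arrays.asList(new Span(4, 5));
        System.out.println("plain span-not:  " + spanNot(a, b));
        System.out.println("padded span-not: " + spanNot(a, pad(b, 2, 2)));
    }
}
```

Without padding, neither span of a overlaps b, so both survive. With a padding of 2 on each side, b covers [2,7) and the span at position 2 is excluded, which is the "not near" behaviour asked for in the issue.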



[jira] [Commented] (LUCENE-3370) Support for a "SpanNotNearQuery"

Posted by "Trejkaz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082708#comment-13082708 ] 

Trejkaz commented on LUCENE-3370:
---------------------------------

That's not a bad idea.

Some care should be taken though - the padding would be counted in the logical way, whereas slop is not (a slop of 0 already matches adjacent terms), so to convert from slop you would have to add 1 to the slop value to get the padding value.
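
The off-by-one can be checked with a small sketch. It assumes single-term spans are half-open ranges [p, p+1) and that exclusion means "overlaps the padded span"; the class SlopToPadding and its methods paddingForSlop and excludedBy are hypothetical names, not part of Lucene.

```java
public class SlopToPadding {
    // Padding equivalent to a given slop, per the off-by-one described above.
    public static int paddingForSlop(int slop) {
        return slop + 1;
    }

    // True if the single-term span at position p, i.e. [p, p+1), overlaps
    // the single-term span at position q once q has been padded by `pad`
    // positions on both sides, i.e. [q - pad, q + 1 + pad).
    public static boolean excludedBy(int p, int q, int pad) {
        int start = q - pad;
        int end = q + 1 + pad;
        return p < end && start < p + 1;
    }

    public static void main(String[] args) {
        // Adjacent terms (positions 5 and 6) are "near" with slop 0.
        System.out.println("pad 0, adjacent:        " + excludedBy(6, 5, 0));
        System.out.println("pad for slop 0, adjacent: " + excludedBy(6, 5, paddingForSlop(0)));
        // One term in between (positions 5 and 7) needs slop 1.
        System.out.println("pad for slop 0, gap 1:    " + excludedBy(7, 5, paddingForSlop(0)));
        System.out.println("pad for slop 1, gap 1:    " + excludedBy(7, 5, paddingForSlop(1)));
    }
}
```

Under these assumptions a padding of 0 only excludes exact overlaps, so using the raw slop value as padding would miss adjacent terms; slop + 1 lines the two up.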


