Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2010/09/26 19:58:32 UTC

[jira] Created: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards
-----------------------------------------------------------------

                 Key: LUCENE-2669
                 URL: https://issues.apache.org/jira/browse/LUCENE-2669
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Search
            Reporter: Michael McCandless
             Fix For: 3.1, 4.0


Subclasses of FilteredTermsEnum are "supposed to" seek forwards only (this gives better performance, typically).

However, we don't check for this, so I added an assert to do that (while digging into testing the SimpleText codec) and NumericRangeQuery trips the assert!

Other MTQs seem not to trip it.

I think I know what's happening -- say NRQ has term ranges a-c, e-f to seek to, but then while it's .next()'ing through the first range, the first term after c is f.  At this point NRQ sees the range a-c is done, and then tries to seek to term e which is before f.  Maybe NRQ's accept method should detect this case (where you've accidentally .next()'d into or possibly beyond the next one or more seek ranges)?
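The scenario above can be reproduced with a small stand-alone simulation. This is only a sketch under stated assumptions, not Lucene code: a TreeSet stands in for the term dictionary, and BackwardSeekDemo and backwardSeeks are hypothetical names. It naively seeks to the lower bound of every range and records each seek target that sorts before the term the enum is already positioned on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class BackwardSeekDemo {
    /** Returns every seek target that sorted before the term the enum was already on. */
    public static List<String> backwardSeeks(TreeSet<String> terms, String[][] ranges) {
        List<String> backward = new ArrayList<>();
        String current = null; // term the simulated enum is positioned on
        for (String[] range : ranges) {
            String lower = range[0];
            String upper = range[1];
            // Naive behavior: always seek to the lower bound of the next range.
            if (current != null && lower.compareTo(current) < 0) {
                backward.add(lower);
            }
            current = terms.ceiling(lower); // seek: first term >= lower
            // next() through the range until a term lies beyond the upper bound.
            while (current != null && current.compareTo(upper) <= 0) {
                current = terms.higher(current);
            }
            if (current == null) {
                break; // term dictionary exhausted
            }
        }
        return backward;
    }

    public static void main(String[] args) {
        // Terms d and e are missing, so the first term after c is f.
        TreeSet<String> terms = new TreeSet<>(Arrays.asList("a", "b", "c", "f", "g"));
        String[][] ranges = { {"a", "c"}, {"e", "f"} };
        System.out.println(backwardSeeks(terms, ranges)); // prints [e]
    }
}
```

With the terms a, b, c, f, g, the enum is already on f when range a-c ends, so the seek to e goes backwards.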

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2669:
----------------------------------

    Attachment: LUCENE-2669.patch

Slightly more readable patch: the for(;;) loop is removed, and the first if check in accept() is negated and used as the while condition instead.

Mike, you set this as Fix For 3.1 and 4.0, but 3.1 does not have FilteredTermsEnum. We cannot fix it there easily, as it uses the old-style logic from 3.0.



[jira] Updated: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2669:
----------------------------------

    Attachment: LUCENE-2669.patch

Here is a patch that fixes NRTE to only seek forward. This should also improve NRQ's performance in trunk.

It works like the following:

# nextSeekTerm() checks whether the next sub-range still fits the *current* term. If not, it advances to the following sub-range, and it returns a seek term that is greater than or equal to the *current* term.
# accept() checks, in the non-hit case (which is rare, because for an NRQ most terms are hits until the upper sub-range bound is reached), whether the lower bound term of the next sub-range on the stack is greater than the *current* term, and only then returns NO_AND_SEEK. Otherwise it does not seek; it only moves forward to the next sub-range and repeats the bounds checks [for(;;) loop].
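The two steps above can be sketched roughly as follows. This is a simplified illustration and not the attached patch: it uses plain strings instead of Lucene terms, its own Status enum in place of the real accept status values, and a hypothetical run() driver over a TreeSet that stands in for the loop in FilteredTermsEnum.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.TreeSet;

public class ForwardOnlyRangeEnum {
    public enum Status { YES, NO_AND_SEEK, END }

    // Sorted bounds of the remaining sub-ranges: lower, upper, lower, upper, ...
    private final Deque<String> bounds = new ArrayDeque<>();
    private String lower, upper; // bounds of the sub-range being enumerated

    public ForwardOnlyRangeEnum(String... sortedBounds) {
        bounds.addAll(Arrays.asList(sortedBounds));
    }

    private void nextRange() {
        lower = bounds.removeFirst();
        upper = bounds.removeFirst();
    }

    /** Step 1: skip sub-ranges already passed; never return a target before current. */
    public String nextSeekTerm(String current) {
        while (bounds.size() >= 2) {
            nextRange();
            if (current != null && current.compareTo(upper) > 0) {
                continue; // current term is beyond this sub-range, it can never hit
            }
            return (current != null && current.compareTo(lower) > 0) ? current : lower;
        }
        lower = upper = null;
        return null; // no sub-ranges left
    }

    /** Step 2: seek only if the next lower bound is greater than the current term. */
    public Status accept(String term) {
        while (upper == null || term.compareTo(upper) > 0) {
            if (bounds.isEmpty()) return Status.END;
            if (term.compareTo(bounds.peekFirst()) < 0) return Status.NO_AND_SEEK;
            nextRange(); // already at or past the next lower bound: step forward, no seek
        }
        return Status.YES;
    }

    /** Hypothetical driver: collects hits and records every seek target in seeks. */
    public static List<String> run(TreeSet<String> terms, ForwardOnlyRangeEnum e, List<String> seeks) {
        List<String> hits = new ArrayList<>();
        String current = null;
        boolean doSeek = true;
        while (true) {
            if (doSeek) {
                doSeek = false;
                String target = e.nextSeekTerm(current);
                if (target == null) return hits;
                seeks.add(target);
                current = terms.ceiling(target); // seek: first term >= target
            } else {
                current = terms.higher(current); // next()
            }
            if (current == null) return hits;
            switch (e.accept(current)) {
                case YES: hits.add(current); break;
                case NO_AND_SEEK: doSeek = true; break;
                case END: return hits;
            }
        }
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList("a", "b", "c", "f", "g"));
        List<String> seeks = new ArrayList<>();
        List<String> hits = run(terms, new ForwardOnlyRangeEnum("a", "c", "e", "f"), seeks);
        System.out.println("hits=" + hits + " seeks=" + seeks); // hits=[a, b, c, f] seeks=[a]
    }
}
```

For the ranges a-c and e-f over the terms a, b, c, f, g, the sketch seeks once (to a) and reaches f by stepping forward, so the backward seek to e never happens.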



[jira] Updated: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-2669:
---------------------------------------

    Attachment: LUCENE-2669.patch

Patch with two asserts. NRQ only trips the first (FilteredTermsEnum) assert. That it doesn't trip the second shows that its seek ranges are indeed properly sorted...
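As a rough illustration of what two such checks could look like (this is not the attached patch; SeekOrderChecks and both method names are hypothetical, and plain strings stand in for terms): the first check compares a seek target against the term the enum is currently on, the second checks that successive seek targets are themselves in order.

```java
import java.util.Arrays;
import java.util.List;

public class SeekOrderChecks {
    /** First check: a seek target must not sort before the current term. */
    public static boolean seeksForward(String currentTerm, String seekTarget) {
        return currentTerm == null || seekTarget.compareTo(currentTerm) >= 0;
    }

    /** Second check: successive seek targets must be in non-decreasing order. */
    public static boolean seekTargetsSorted(List<String> targets) {
        for (int i = 1; i < targets.size(); i++) {
            if (targets.get(i).compareTo(targets.get(i - 1)) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Seek targets a then e are sorted, so the second check passes...
        System.out.println(seekTargetsSorted(Arrays.asList("a", "e"))); // true
        // ...but seeking to e while positioned on f fails the first check.
        System.out.println(seeksForward("f", "e")); // false
    }
}
```

This matches the a-c, e-f example from the issue description: the targets are sorted, yet the seek to e happens after the enum has already reached f.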



[jira] Assigned: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler reassigned LUCENE-2669:
-------------------------------------

    Assignee: Uwe Schindler



[jira] Commented: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915031#action_12915031 ] 

Michael McCandless commented on LUCENE-2669:
--------------------------------------------

bq. Mike you set this as fix 3.1 and 4.0, but 3.1 does not have FilteredTermsEnum. We cannot fix it there easily, as it uses the old style logic from 3.0.

Woops, right -- I'll fix to 4.0 only.



[jira] Commented: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915013#action_12915013 ] 

Robert Muir commented on LUCENE-2669:
-------------------------------------

This is a good catch: NRQ should play ping-pong to avoid these unnecessary seeks :)




[jira] Commented: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915243#action_12915243 ] 

Michael McCandless commented on LUCENE-2669:
--------------------------------------------

Sweet, that was fast -- thanks Uwe!



[jira] Updated: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-2669:
---------------------------------------

    Fix Version/s:     (was: 3.1)



[jira] Commented: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915032#action_12915032 ] 

Uwe Schindler commented on LUCENE-2669:
---------------------------------------

bq. Mike you set this as fix 3.1 and 4.0, but 3.1 does not have FilteredTermsEnum. We cannot fix it there easily, as it uses the old style logic from 3.0.

We can maybe also fix this in 3.0 and not fetch a new enum when the same conditions apply. But it's totally different code; I will do that in a separate patch if it's easy (the 3.0/3.1 enum is complicated...)



[jira] Resolved: (LUCENE-2669) NumericRangeQuery.NumericRangeTermsEnum sometimes seeks backwards

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler resolved LUCENE-2669.
-----------------------------------

    Resolution: Fixed

Committed revision: 1001582

Thanks Mike for catching this!
