You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2011/06/22 00:13:47 UTC

[jira] [Created] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Optimize TermsEnum.seek when caller doesn't need next term
----------------------------------------------------------

                 Key: LUCENE-3225
                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Michael McCandless
            Assignee: Michael McCandless
             Fix For: 4.0


Some codecs are able to save CPU if the caller is only interested in
exact matches.  EG, Memory codec and SimpleText can do more efficient
FSTEnum lookup if they know the caller doesn't need to know the term
following the seek term.

We have cases like this in Lucene, eg when IW deletes documents by
Term, if the term is not found in a given segment then it doesn't need
to know the ceiling term.  Likewise when TermQuery looks up the term
in each segment.

I had done this change as part of LUCENE-3030, which is a new terms
index that's able to save seeking for exact-only lookups, but now that
we have Memory codec that can also save CPU I think we should commit
this today.

The change adds a "boolean onlyExact" param to seek(BytesRef).


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053964#comment-13053964 ] 

Michael McCandless commented on LUCENE-3225:
--------------------------------------------

bq. Yet the other should the be seekFloor no?

Ahhh right, we had discussed on the dev list.  I agree!

But, we should do this in another issue.  Though, I think we should rename the current seek to seekCeil; I'll do that here.

> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Resolved] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-3225.
----------------------------------------

    Resolution: Fixed

> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch, LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054364#comment-13054364 ] 

Simon Willnauer commented on LUCENE-3225:
-----------------------------------------

looks good +1 to commit! thanks for working on that

> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch, LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053844#comment-13053844 ] 

Michael McCandless commented on LUCENE-3225:
--------------------------------------------

bq. Mike this seems like a good improvement but I think letting a user change the behavior of method X by passing true / false to method Y is no good. I think this is kind of error prone plus its cluttering the seek method though. Once Boolean is enough here. I think we should rather restrict this to allow users to pull an exactMatchOnly TermsEnum which does only support exact matches and throws a clear exception if next is called. I know that makes things slightly harder especially to deal with our ThreadLocal cached TermsEnum instances but I think that is better here.

Well, it only means the enum is unpositioned if you get back
NOT_FOUND?  Ie, it's just like if you get back null from next(), or
END from seek(): in these cases, the enum is unpositioned and you need
to call seek again.

My worry if we force an up-front decision here ("exact only" enum vs
"non-exact only enum") is we prevent legitimate use cases where the
caller wants to mix & match with one enum.

EG, when AutomatonQuery intersects w/ the terms, when it hits are
region where terms are denser than what the automaton will accept
(such as an "infinite" part), it should use exact seeking, but then
when it's in a region where terms are less dense (eg a "finite" part)
it should use exact seeking.... I'll open a separate issue for this.

The TermsEnum impls can be efficient in this case, ie re-using
internal seek state for the exat and non-exact cases (MemoryCodec does
this).

But I agree another boolean to seek isn't great; maybe instead we can
make a seperate seekExact method?  Default impl would just call seek
(and get no perf gains).

BTW, similarly, I think we have a missing API in DISI (for
scoring): advance always does a next() if the target doc doesn't
match.  But we can get substantial performance gains in some cases
(see LUCENE-1536) if we had an advanceExact that would not do the
next and simply tell us if this doc matched or not.

bq. Can we somehow leave the extra CPU work to the term() call and make this entirely lazy?

Not sure what you meant here?


> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053603#comment-13053603 ] 

Yonik Seeley commented on LUCENE-3225:
--------------------------------------

+1, looks good.

> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053885#comment-13053885 ] 

Simon Willnauer commented on LUCENE-3225:
-----------------------------------------

{quote}
BTW, similarly, I think we have a missing API in DISI (for
scoring): advance always does a next() if the target doc doesn't
match. But we can get substantial performance gains in some cases
(see LUCENE-1536) if we had an advanceExact that would not do the
next and simply tell us if this doc matched or not.
{quote}

+1!!
{quote}
But I agree another boolean to seek isn't great; maybe instead we can
make a seperate seekExact method? Default impl would just call seek
(and get no perf gains).
{quote}

thats another option and I like that better though. Yet the other should the be seekFloor no?

bq. not sure what you meant here?

nevermind I only looked at the top of the patch and figured that we only safe the loading into bytesref but there is more about it...



> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Updated] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3225:
---------------------------------------

    Attachment: LUCENE-3225.patch

OK, new patch: I added a new seekExact method (instead of new boolean to seek); renamed existing seek methods to either seekCeil or seekExact; changed seekExact(long ord) to not return a value (it's an error to pass out-of-bounds ord to this method).  I think it's ready!

> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch, LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053675#comment-13053675 ] 

Simon Willnauer commented on LUCENE-3225:
-----------------------------------------

Mike this seems like a good improvement but I think letting a user change the behavior of method X by passing true / false to method Y is no good. I think this is kind of error prone plus its cluttering the seek method though. Once Boolean is enough here. I think we should rather restrict this to allow users to pull an exactMatchOnly TermsEnum which does only support exact matches and throws a clear exception if next is called. I know that makes things slightly harder especially to deal with our ThreadLocal cached TermsEnum instances but I think that is better here. Can we somehow leave the extra CPU work to the term() call and make this entirely lazy?


> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Dawid Weiss (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054360#comment-13054360 ] 

Dawid Weiss commented on LUCENE-3225:
-------------------------------------

I like this one better. boolean args are cryptic (even if I do use them from time to time).

> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch, LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Updated] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3225:
---------------------------------------

    Attachment: LUCENE-3225.patch

Patch.  All tests pass... I think it's ready!

> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053846#comment-13053846 ] 

Michael McCandless commented on LUCENE-3225:
--------------------------------------------

This patch gives nice gains for MemoryCodec: I did a quick test w/ my
NRT stress test (reopen at 2X Twitter's peak indexing rate) and the
reopen time dropped from ~49 msec to ~43 msec (~12% faster).  This is
impressive because resolving deletes is just one part of opening the
NRT reader, ie we also must write the new segment, open SegmentReader
against it, etc.


> Optimize TermsEnum.seek when caller doesn't need next term
> ----------------------------------------------------------
>
>                 Key: LUCENE-3225
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3225
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3225.patch
>
>
> Some codecs are able to save CPU if the caller is only interested in
> exact matches.  EG, Memory codec and SimpleText can do more efficient
> FSTEnum lookup if they know the caller doesn't need to know the term
> following the seek term.
> We have cases like this in Lucene, eg when IW deletes documents by
> Term, if the term is not found in a given segment then it doesn't need
> to know the ceiling term.  Likewise when TermQuery looks up the term
> in each segment.
> I had done this change as part of LUCENE-3030, which is a new terms
> index that's able to save seeking for exact-only lookups, but now that
> we have Memory codec that can also save CPU I think we should commit
> this today.
> The change adds a "boolean onlyExact" param to seek(BytesRef).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org