Posted to dev@lucene.apache.org by "Karthick Sankarachary (JIRA)" <ji...@apache.org> on 2010/04/21 02:30:50 UTC

[jira] Created: (LUCENE-2406) Max Out Hit Limits To Max Doc

Max Out Hit Limits To Max Doc
-----------------------------

                 Key: LUCENE-2406
                 URL: https://issues.apache.org/jira/browse/LUCENE-2406
             Project: Lucene - Java
          Issue Type: Bug
    Affects Versions: 3.0
            Reporter: Karthick Sankarachary


Currently, IndexSearcher lets you limit the number of hits that a search returns. Ironically, that option works against you if the limit is set to a very large number. In particular, during the initialization of the search process, the hit queue attempts to allocate as many score-document slots as the hit limit. A very large hit limit passed by the user is therefore likely to cause an out-of-memory error.

This issue can be reproduced by setting the hit limit to the maximum integer value (see the attached test case). Note that the test fails in the PriorityQueue#initialize method when it tries to increase the hit limit by one (to make room for a sentinel object), which causes an integer overflow (see the attached stack trace).
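
To make the overflow concrete, here is a minimal sketch in plain Java. It is not Lucene's code: the class and method names are hypothetical, and it only assumes what the description above states, namely that the queue sizes its backing array as the hit limit plus one.

```java
// Hypothetical stand-in for the queue sizing described above; not Lucene's code.
class HitLimitOverflowDemo {

    // Mimics a queue that reserves one extra slot beyond the requested limit.
    static int heapSizeFor(int hitLimit) {
        return hitLimit + 1; // wraps to a negative value when hitLimit == Integer.MAX_VALUE
    }

    // Returns true if allocating the backing array fails because the size is negative.
    static boolean allocationFails(int hitLimit) {
        try {
            Object[] heap = new Object[heapSizeFor(hitLimit)];
            return heap.length < 0; // always false: the allocation succeeded
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(heapSizeFor(10));                    // 11
        System.out.println(heapSizeFor(Integer.MAX_VALUE));     // -2147483648
        System.out.println(allocationFails(Integer.MAX_VALUE)); // true
    }
}
```

With a limit just below the maximum the size no longer overflows, but the allocation would still request an array of roughly two billion slots, which is the out-of-memory case described earlier.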

The root cause of this issue lies not in the priority queue, but in the index searcher itself. Ideally, it should ensure that the hit limit does not exceed its maxDoc count, which is typically the maximum number of documents held in the underlying index reader. The attached patch implements this sanity check.
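
The sanity check amounts to clamping the requested limit before any queue is sized. The following is a rough sketch of that idea only, not the attached patch: the class name, method name, and the rejection of non-positive limits are assumptions made for illustration, and maxDoc is passed in as a plain int rather than read from a real reader.

```java
// Hypothetical illustration of the sanity check; not the attached patch.
class HitLimitClamp {

    // Caps the requested hit limit at the number of documents the reader could return.
    static int effectiveLimit(int requestedHits, int maxDoc) {
        if (requestedHits <= 0) {
            // Assumption: a non-positive limit is treated as a caller error.
            throw new IllegalArgumentException("requestedHits must be positive");
        }
        return Math.min(requestedHits, maxDoc);
    }

    public static void main(String[] args) {
        int maxDoc = 1_000; // assumed size of the underlying index
        System.out.println(effectiveLimit(10, maxDoc));                // 10
        System.out.println(effectiveLimit(Integer.MAX_VALUE, maxDoc)); // 1000
    }
}
```

Because the clamped value never exceeds maxDoc, adding one for a sentinel slot cannot overflow unless the index itself holds Integer.MAX_VALUE documents.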

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2406) Max Out Hit Limits To Max Doc

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated LUCENE-2406:
------------------------------------------

    Attachment: LUCENE-2406.patch
                LUCENE-2406-error.txt
                TestMaxHitLimit.java




[jira] Commented: (LUCENE-2406) Max Out Hit Limits To Max Doc

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860127#action_12860127 ] 

Karthick Sankarachary commented on LUCENE-2406:
-----------------------------------------------

Thanks! I guess this is my cue to move to the trunk.




[jira] Resolved: (LUCENE-2406) Max Out Hit Limits To Max Doc

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-2406.
----------------------------------------

    Resolution: Duplicate

Thanks for the patch!

This has actually already been fixed in trunk (LUCENE-2119).  Also note that you can use .numDocs() instead of .maxDoc().
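
For readers unfamiliar with the distinction: in Lucene, maxDoc() still counts deleted documents that have not yet been merged away, while numDocs() counts only live documents, so numDocs() gives the tighter bound on how many hits a search can return. The toy model below is a sketch of that difference only. ToyReader is a hypothetical class and does not use Lucene's IndexReader.

```java
// Toy model of a reader with deletions; hypothetical, not Lucene's IndexReader.
class ToyReader {
    private final boolean[] deleted; // deleted[i] is true if document i is marked deleted

    ToyReader(boolean[] deleted) {
        this.deleted = deleted.clone();
    }

    // Upper bound on document IDs, including deleted documents.
    int maxDoc() {
        return deleted.length;
    }

    // Number of live (non-deleted) documents.
    int numDocs() {
        int live = 0;
        for (boolean d : deleted) {
            if (!d) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        ToyReader reader = new ToyReader(new boolean[] {false, true, false, true, false});
        System.out.println(reader.maxDoc());  // 5
        System.out.println(reader.numDocs()); // 3
        // Clamping a huge limit to numDocs() yields the smaller queue.
        System.out.println(Math.min(Integer.MAX_VALUE, reader.numDocs())); // 3
    }
}
```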

