Posted to java-user@lucene.apache.org by Paul Taylor <pa...@fastmail.fm> on 2009/12/02 14:34:19 UTC

java.lang.NegativeArraySizeException on searching using Integer.MAX_VALUE for number of hits

Hi, I just upgraded my code to Lucene 3.0, and on one simple search I
get the following stack trace when I pass Integer.MAX_VALUE to the
Searcher.search(Query query, int n) method. If I change the value to
1000, it works okay.


java.lang.NegativeArraySizeException
    at org.apache.lucene.util.PriorityQueue.initialize(PriorityQueue.java:90)
    at org.apache.lucene.search.HitQueue.<init>(HitQueue.java:67)
    at org.apache.lucene.search.TopScoreDocCollector.<init>(TopScoreDocCollector.java:117)
    at org.apache.lucene.search.TopScoreDocCollector.<init>(TopScoreDocCollector.java:37)
    at org.apache.lucene.search.TopScoreDocCollector$InOrderTopScoreDocCollector.<init>(TopScoreDocCollector.java:42)
    at org.apache.lucene.search.TopScoreDocCollector$InOrderTopScoreDocCollector.<init>(TopScoreDocCollector.java:40)
    at org.apache.lucene.search.TopScoreDocCollector.create(TopScoreDocCollector.java:104)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:167)
    at org.apache.lucene.search.Searcher.search(Searcher.java:98)
    at org.apache.lucene.search.Searcher.search(Searcher.java:108)

Now I know I should specify a max hits value, but I really want to
return all matches, and regardless it shouldn't throw this exception.
In other search code that also specifies Integer.MAX_VALUE, the
exception does not occur.


thanks Paul



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: java.lang.NegativeArraySizeException on searching using Integer.MAX_VALUE for number of hits

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Wed, Dec 2, 2009 at 10:18 AM, Paul Taylor <pa...@fastmail.fm> wrote:
> Uwe Schindler wrote:
>>
>> If you want to have all results, you are doing something wrong. :-)
>>
>> Full-text engines like Lucene are made for returning only the
>> top-ranking results. So if you use TopDocs, you must know in advance
>> how many results you want. Internally, Lucene works with
>> PriorityQueues that filter the top-ranking results.
>>
>> If you want to have all results, you should not sort them by ranking
>> (which is not needed then). In this case, implement your own Collector
>> and collect the results yourself into, e.g., ArrayLists (but they are
>> unsorted then).
>>
>> Another possibility (if you really need the docs in relevance order)
>> is to run the search twice: first collect only the top n=100 results.
>> The TopDocs instance also returns the total number of hits. Using that
>> number, you can re-run the query to get all ranked results. But this
>> is generally a bad approach, because PQs get slower with very many
>> results, where the order is not relevant anyway.
>
> Yes, that's all fine and is my long-term approach, but isn't there a bug
> here? (In the case that is failing there is only one actual result.) You
> haven't really addressed why I am getting this exception.

This is certainly confusing (that you hit a NegativeArraySizeException
on passing Integer.MAX_VALUE); it's because the PQ allocates 1 + the
size you pass in, which overflows int and wraps around to a negative
value (Integer.MIN_VALUE).  I suppose we could bounds check that in PQ,
though if we do that, very likely the next thing you hit is an OOME,
which would be the better exception here.  I'll open an issue.

Mike
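
The overflow described above can be reproduced without Lucene. The
following is only a minimal sketch: the class and method names are
hypothetical, heapSizeFor() merely mirrors the "1 + the size you pass
in" arithmetic from this message, and clampHits() is one possible
caller-side guard, assuming you can obtain a document count (for
example from IndexReader.maxDoc()) to cap the request with.

```java
class HeapSizeOverflow {
    /** Hypothetical helper: mirrors the "requested size + 1" arithmetic. */
    static int heapSizeFor(int maxSize) {
        return maxSize + 1; // int overflow when maxSize == Integer.MAX_VALUE
    }

    /** Returns true if allocating the backing array throws NegativeArraySizeException. */
    static boolean allocationFails(int maxSize) {
        try {
            Object[] heap = new Object[heapSizeFor(maxSize)];
            return false; // allocation succeeded
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    /** Hypothetical guard: never request more hits than there are documents, and at least 1. */
    static int clampHits(int requested, int maxDoc) {
        return Math.max(1, Math.min(requested, maxDoc));
    }

    public static void main(String[] args) {
        System.out.println(heapSizeFor(Integer.MAX_VALUE));     // -2147483648
        System.out.println(allocationFails(Integer.MAX_VALUE)); // true
        System.out.println(allocationFails(1000));              // false
        System.out.println(clampHits(Integer.MAX_VALUE, 42));   // 42
    }
}
```

Clamping on the caller's side sidesteps the overflow, but it still
sizes the queue to the whole index, so the memory concern raised in
this thread remains.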



Re: java.lang.NegativeArraySizeException on searching using Integer.MAX_VALUE for number of hits

Posted by Paul Taylor <pa...@fastmail.fm>.
Uwe Schindler wrote:
> If you want to have all results, you are doing something wrong. :-)
>
> Full-text engines like Lucene are made for returning only the
> top-ranking results. So if you use TopDocs, you must know in advance
> how many results you want. Internally, Lucene works with
> PriorityQueues that filter the top-ranking results.
>
> If you want to have all results, you should not sort them by ranking
> (which is not needed then). In this case, implement your own Collector
> and collect the results yourself into, e.g., ArrayLists (but they are
> unsorted then).
>
> Another possibility (if you really need the docs in relevance order)
> is to run the search twice: first collect only the top n=100 results.
> The TopDocs instance also returns the total number of hits. Using that
> number, you can re-run the query to get all ranked results. But this
> is generally a bad approach, because PQs get slower with very many
> results, where the order is not relevant anyway.
Yes, that's all fine and is my long-term approach, but isn't there a bug
here? (In the case that is failing there is only one actual result.)
You haven't really addressed why I am getting this exception.


Paul



RE: java.lang.NegativeArraySizeException on searching using Integer.MAX_VALUE for number of hits

Posted by Uwe Schindler <uw...@thetaphi.de>.
If you want to have all results, you are doing something wrong. :-)

Full-text engines like Lucene are made for returning only the
top-ranking results. So if you use TopDocs, you must know in advance
how many results you want. Internally, Lucene works with
PriorityQueues that filter the top-ranking results.

If you want to have all results, you should not sort them by ranking
(which is not needed then). In this case, implement your own Collector
and collect the results yourself into, e.g., ArrayLists (but they are
unsorted then).

Another possibility (if you really need the docs in relevance order)
is to run the search twice: first collect only the top n=100 results.
The TopDocs instance also returns the total number of hits. Using that
number, you can re-run the query to get all ranked results. But this
is generally a bad approach, because PQs get slower with very many
results, where the order is not relevant anyway.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
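
To illustrate why the number of top results has to be known up front,
here is a rough, library-free sketch of top-n selection with a bounded
min-heap. It is not Lucene's implementation: the class name, the
topN() helper and the sample scores are made up for illustration, and
it uses java.util.PriorityQueue rather than Lucene's own PriorityQueue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class TopNSketch {
    /** Keeps only the n highest scores; the heap never holds more than n entries. */
    static List<Float> topN(float[] scores, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be > 0");
        }
        // Min-heap: the weakest of the current top n sits at the head.
        PriorityQueue<Float> heap = new PriorityQueue<Float>();
        for (float score : scores) {
            if (heap.size() < n) {
                heap.add(score);
            } else if (score > heap.peek()) {
                heap.poll();     // drop the current weakest hit
                heap.add(score); // and keep the better one
            }
        }
        // Return the survivors in descending score order.
        List<Float> result = new ArrayList<Float>(heap);
        Collections.sort(result, Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        float[] scores = {0.2f, 0.9f, 0.5f, 0.7f, 0.1f};
        System.out.println(topN(scores, 3)); // [0.9, 0.7, 0.5]
    }
}
```

With a small n, each hit costs at most one comparison against the
weakest kept entry plus an occasional heap update. If n is as large as
the result set, every hit goes through the heap, which is the slowdown
mentioned above, and a plain list would do the same job when ranking
is not needed.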




Re: java.lang.NegativeArraySizeException on searching using Integer.MAX_VALUE for number of hits

Posted by Michael McCandless <lu...@mikemccandless.com>.
OK, I opened https://issues.apache.org/jira/browse/LUCENE-2119

Mike
