Posted to dev@lucene.apache.org by Tom Burton-West <tb...@umich.edu> on 2014/05/17 01:01:15 UTC
Hello all,
I'm trying to get relevance scoring information for each of 1,000 docs
returned for each of 250 queries. If I run the query (appended below)
without debugQuery=on, I have no problem with getting all the results with
under 4GB of memory use. If I add the parameter &debugQuery=on, memory use
goes up continuously and after about 20 queries (with 1,000 results each),
memory use reaches about 29.1 GB and the garbage collector gives up:
" org.apache.solr.common.SolrException; null:java.lang.RuntimeException:
java.lang.OutOfMemoryError: GC overhead limit exceeded"
I've attached a jmap -histo; an excerpt is below.
Is this a known issue with debugQuery?
Tom
----
query:
q=Abraham+Lincoln&fl=id,score&indent=on&wt=json&start=0&rows=1000&version=2.2&debugQuery=on
without debugQuery=on:
q=Abraham+Lincoln&fl=id,score&indent=on&wt=json&start=0&rows=1000&version=2.2
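For scripting 250 such queries, the two request URLs above can be built programmatically rather than hand-edited. A minimal sketch (the host, port, and core name here are placeholders, not taken from the original message):

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port, and core name for your setup.
BASE = "http://localhost:8983/solr/core1/select"

def build_query(debug: bool) -> str:
    """Build the select URL shown above, with or without debugQuery."""
    params = {
        "q": "Abraham Lincoln",
        "fl": "id,score",
        "indent": "on",
        "wt": "json",
        "start": 0,
        "rows": 1000,
        "version": "2.2",
    }
    if debug:
        # This is the parameter that triggers per-document explain() output.
        params["debugQuery"] = "on"
    return BASE + "?" + urlencode(params)

print(build_query(True))   # debug run
print(build_query(False))  # baseline run
```

Note that urlencode percent-escapes the comma in fl=id,score; Solr accepts either form.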
num    #instances          #bytes  Class description
--------------------------------------------------------------------------
  1:      585,559  10,292,067,456  byte[]
  2:      743,639  18,874,349,592  char[]
  3:       53,821      91,936,328  long[]
  4:       70,430      69,234,400  int[]
  5:       51,348      27,111,744  org.apache.lucene.util.fst.FST$Arc[]
  6:      286,357      20,617,704  org.apache.lucene.util.fst.FST$Arc
  7:      715,364      17,168,736  java.lang.String
  8:       79,561      12,547,792  * ConstMethodKlass
  9:       18,909      11,404,696  short[]
 10:      345,854      11,067,328  java.util.HashMap$Entry
 11:        8,823      10,351,024  * ConstantPoolKlass
 12:       79,561      10,193,328  * MethodKlass
 13:      228,587       9,143,480  org.apache.lucene.document.FieldType
 14:      228,584       9,143,360  org.apache.lucene.document.Field
 15:      368,423       8,842,152  org.apache.lucene.util.BytesRef
 16:      210,342       8,413,680  java.util.TreeMap$Entry
 17:       81,576       8,204,648  java.util.HashMap$Entry[]
 18:      107,921       7,770,312  org.apache.lucene.util.fst.FST$Arc
 19:       13,020       6,874,560  org.apache.lucene.util.fst.FST$Arc[]
Re:
Posted by Yonik Seeley <yo...@heliosearch.com>.
On Sat, May 17, 2014 at 12:11 PM, Tom Burton-West <tb...@umich.edu> wrote:
> I understand it's expensive, but it appears that it is not freeing up memory
> after each debugQuery is run. That seems like it should be avoidable (I say
> that without having looked at the code). Should I open a JIRA about a
> possible memory leak?
Yes, please do!
-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters&fieldcache
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re:
Posted by Tom Burton-West <tb...@umich.edu>.
Thanks Mikhail,
I understand it's expensive, but it appears that it is not freeing up memory
after each debugQuery is run. That seems like it should be avoidable (I
say that without having looked at the code). Should I open a JIRA about a
possible memory leak?
Tom
On Sat, May 17, 2014 at 8:20 AM, Mikhail Khludnev <
mkhludnev@griddynamics.com> wrote:
> For sure. Lucene's explain() is really expensive and is not intended for
> production use, only for occasional troubleshooting. As a mitigation, you
> can scroll through the result set in small portions, which is more
> efficient; Hoss recently explained how at SearchHub. For problems like
> this, it is usually possible to write a specialized custom collector that
> does exactly what you need.
>
> Have a good day!
Re:
Posted by Mikhail Khludnev <mk...@griddynamics.com>.
For sure. Lucene's explain() is really expensive and is not intended for
production use, only for occasional troubleshooting. As a mitigation, you
can scroll through the result set in small portions, which is more
efficient; Hoss recently explained how at SearchHub. For problems like
this, it is usually possible to write a specialized custom collector that
does exactly what you need.
Have a good day!
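The suggestion to scroll through the result set in small portions can be sketched as follows: instead of one request with rows=1000, compute a series of (start, rows) windows and issue one small request per window. This is a plain illustration of the paging arithmetic, not the more efficient technique Hoss described:

```python
def page_windows(total: int, page_size: int):
    """Yield (start, rows) pairs that cover `total` results in small pages.

    Each pair maps directly onto the &start=...&rows=... parameters of a
    Solr select request.
    """
    for start in range(0, total, page_size):
        yield start, min(page_size, total - start)

# Ten requests of 100 docs each instead of one request for 1000.
windows = list(page_windows(1000, 100))
```

One caveat with this naive approach: large start offsets get progressively more expensive, which is why Solr later added cursorMark-based deep paging as the preferred way to walk a big result set.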
--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
<http://www.griddynamics.com>
<mk...@griddynamics.com>