Posted to dev@lucene.apache.org by Mikhail Khludnev <mk...@griddynamics.com> on 2014/05/17 14:20:55 UTC

Re:

For sure. Lucene's explain is really expensive and is not intended for
production use, only for occasional troubleshooting. As a mitigation you
can scroll through the result set in small pages, which is more efficient,
as Hoss recently explained on SearchHub. For this kind of problem it's
usually also possible to write a specialized custom collector that does
exactly what you need.
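
Roughly, that cursor-based scrolling could look like the untested SolrJ
sketch below. I'm assuming Solr 4.7+ (for cursorMark), that "id" is your
uniqueKey, and the URL and core name are placeholders:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.params.CursorMarkParams;

public class CursorPaging {
  public static void main(String[] args) throws Exception {
    // Placeholder URL and core name; point this at your own instance.
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");

    SolrQuery q = new SolrQuery("Abraham Lincoln");
    q.setFields("id", "score");
    q.setRows(100);                                  // small pages instead of rows=1000
    // cursorMark needs a deterministic sort that ends on the uniqueKey.
    q.setSort(SolrQuery.SortClause.desc("score"));
    q.addSort(SolrQuery.SortClause.asc("id"));

    String cursor = CursorMarkParams.CURSOR_MARK_START;   // "*"
    while (true) {
      q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
      QueryResponse rsp = solr.query(q);
      for (SolrDocument doc : rsp.getResults()) {
        System.out.println(doc.getFieldValue("id") + " " + doc.getFieldValue("score"));
      }
      String next = rsp.getNextCursorMark();
      if (cursor.equals(next)) break;                // cursor stops changing => done
      cursor = next;
    }
  }
}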

Have a good day!


On Sat, May 17, 2014 at 3:01 AM, Tom Burton-West <tb...@umich.edu> wrote:

> Hello all,
>
>
> I'm trying to get relevance scoring information for each of 1,000 docs
> returned for each of 250 queries.    If I run the query (appended below)
> without debugQuery=on, I have no problem with getting all the results
> with under 4GB of memory use.  If I add the parameter &debugQuery=on,
> memory use goes up continuously and after about 20 queries (with 1,000
> results each), memory use reaches about 29.1 GB and the garbage collector
> gives up:
>
> " org.apache.solr.common.SolrException; null:java.lang.RuntimeException:
> java.lang.OutOfMemoryError: GC overhead limit exceeded"
>
> I've attached a jmap -histo, excerpt below.
>
> Is this a known issue with debugQuery?
>
> Tom
> ----
> query:
>
>
> q=Abraham+Lincoln&fl=id,score&indent=on&wt=json&start=0&rows=1000&version=2.2&debugQuery=on
>
> without debugQuery=on:
>
>
> q=Abraham+Lincoln&fl=id,score&indent=on&wt=json&start=0&rows=1000&version=2.2
>
> num   #instances          #bytes  Class description
> --------------------------------------------------------------------------
>  1:      585,559  10,292,067,456  byte[]
>  2:      743,639  18,874,349,592  char[]
>  3:       53,821      91,936,328  long[]
>  4:       70,430      69,234,400  int[]
>  5:       51,348      27,111,744  org.apache.lucene.util.fst.FST$Arc[]
>  6:      286,357      20,617,704  org.apache.lucene.util.fst.FST$Arc
>  7:      715,364      17,168,736  java.lang.String
>  8:       79,561      12,547,792  * ConstMethodKlass
>  9:       18,909      11,404,696  short[]
> 10:      345,854      11,067,328  java.util.HashMap$Entry
> 11:        8,823      10,351,024  * ConstantPoolKlass
> 12:       79,561      10,193,328  * MethodKlass
> 13:      228,587       9,143,480  org.apache.lucene.document.FieldType
> 14:      228,584       9,143,360  org.apache.lucene.document.Field
> 15:      368,423       8,842,152  org.apache.lucene.util.BytesRef
> 16:      210,342       8,413,680  java.util.TreeMap$Entry
> 17:       81,576       8,204,648  java.util.HashMap$Entry[]
> 18:      107,921       7,770,312  org.apache.lucene.util.fst.FST$Arc
> 19:       13,020       6,874,560  org.apache.lucene.util.fst.FST$Arc[]
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>

Re:

Posted by Yonik Seeley <yo...@heliosearch.com>.
On Sat, May 17, 2014 at 12:11 PM, Tom Burton-West <tb...@umich.edu> wrote:
> I understand it's expensive, but it appears that it is not freeing up memory
> after each debugQuery is run.  That seems like it should be avoidable (I say
> that without having looked at the code).  Should I open a JIRA about a
> possible memory leak?

Yes, please do!

-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters&fieldcache

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re:

Posted by Tom Burton-West <tb...@umich.edu>.
Thanks Mikhail,

I understand it's expensive, but it appears that it is not freeing up memory
after each debugQuery is run.  That seems like it should be avoidable (I
say that without having looked at the code).  Should I open a JIRA about a
possible memory leak?
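
For reference, something like the untested SolrJ sketch below is what I
have in mind as a reproduction (the URL and core name are placeholders):
repeat the same rows=1000 query with debug enabled and check the server
heap with jmap -histo between runs.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.params.CommonParams;

public class DebugQueryRepro {
  public static void main(String[] args) throws Exception {
    // Placeholder URL and core name; point at the affected instance.
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");

    SolrQuery q = new SolrQuery("Abraham Lincoln");
    q.setFields("id", "score");
    q.setStart(0);
    q.setRows(1000);
    q.set(CommonParams.DEBUG_QUERY, "true");   // same effect as &debugQuery=on

    for (int i = 1; i <= 25; i++) {
      solr.query(q);
      System.out.println("finished query " + i);
      // Between iterations, snapshot the *server* JVM heap, e.g.
      //   jmap -histo:live <solr pid>
      // and check whether the explain-related objects keep accumulating.
    }
  }
}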

Tom


On Sat, May 17, 2014 at 8:20 AM, Mikhail Khludnev <
mkhludnev@griddynamics.com> wrote:

> For sure. Lucene's explain is really expensive and is not intended for
> production use, only for occasional troubleshooting. As a mitigation you
> can scroll through the result set in small pages, which is more efficient,
> as Hoss recently explained on SearchHub. For this kind of problem it's
> usually also possible to write a specialized custom collector that does
> exactly what you need.
>
> Have a good day!