Posted to solr-user@lucene.apache.org by Rebecca Watson <be...@gmail.com> on 2010/08/18 19:54:38 UTC

tii RAM usage on startup

hi,

I am running solr 1.4.1 and java 1.6 with 6GB heap and the following
GC settings:
gc_args="-XX:+UseConcMarkSweepGC
-XX:+CMSClassUnloadingEnabled -XX:NewSize=2g -XX:MaxNewSize=2g
-XX:CMSInitiatingOccupancyFraction=60"

So 6GB total heap, with 2GB of it allocated to the young generation
(eden plus survivor spaces).
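
To check how the heap is actually divided under these flags, the JVM's own memory pool beans can be listed. The following is only a minimal sketch, not something from the original post: the class name HeapPoolReport is made up for illustration, and the pool names it prints vary by collector and JVM version. Note that -XX:NewSize sizes the whole young generation, so the eden pool it reports will be somewhat below 2GB once the survivor spaces are taken out.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Solr or Lucene): lists the heap pools
// of the JVM it runs in, with current usage and configured maximum.
class HeapPoolReport {

    /** Returns one line per heap memory pool: name, used MB, max MB. */
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means the maximum is undefined
            String maxText = max < 0 ? "undefined" : (max >> 20) + " MB";
            lines.add(String.format("%-30s used=%d MB  max=%s",
                    pool.getName(), usage.getUsed() >> 20, maxText));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeHeapPools()) {
            System.out.println(line);
        }
    }
}
```

Launching this small program with the same gc_args and heap size as the Solr server shows the sizing those flags produce; it does not inspect a different, already running process.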

I have caching, autocommit and auto-warming commented out of
solrconfig.xml

After I index 500k docs and call commit/optimize (via URL, after indexing
has completed), my RAM usage is only about 1.5GB. But if I then stop
and restart my Solr server over the same data, RAM usage immediately
jumps to about 4GB, and I can't understand why there is a difference.
As this is close to the old gen limit, I quickly find that Solr
becomes unresponsive.

The following shows tii files of about 26MB on disk consuming over
200MB of RAM once they are loaded when I restart the server.

is this expected?

thanks for any help/advice in advance,

bec :)

-----------------

Rebecca-Watsons-iMac:work iwatson$ jmap -histo:live 8992 | head -30

 num     #instances         #bytes  class name
----------------------------------------------
   1:      18334714     1422732624  [C
   2:      18332491      733299640  java.lang.String
   3:       6104929      244197160  org.apache.lucene.index.TermInfo
   4:       6104929      244197160  org.apache.lucene.index.TermInfo
   5:       6104929      244197160  org.apache.lucene.index.TermInfo
   6:       6104921      195357472  org.apache.lucene.index.Term
   7:       6104921      195357472  org.apache.lucene.index.Term
   8:       6104921      195357472  org.apache.lucene.index.Term
   9:           224      146527408  [J
  10:            10       48839592  [Lorg.apache.lucene.index.TermInfo;
  11:            10       48839592  [Lorg.apache.lucene.index.Term;
  12:            10       48839592  [Lorg.apache.lucene.index.TermInfo;
  13:            10       48839592  [Lorg.apache.lucene.index.TermInfo;
  14:            10       48839592  [Lorg.apache.lucene.index.Term;
  15:            10       48839592  [Lorg.apache.lucene.index.Term;
  16:         41630        6264728  <constMethodKlass>
  17:         41630        5005104  <methodKlass>
  18:          4049        4596352  <constantPoolKlass>
  19:          4049        3049984  <instanceKlassKlass>
  20:          3129        2580040  <constantPoolCacheKlass>
  21:         49713        2418496  <symbolKlass>
  22:          4983        1067192  [B
  23:          4381         806104  java.lang.Class
  24:          5979         533064  [[I
  25:          6124         438080  [S
  26:          7951         381648  java.util.HashMap$Entry
  27:          2071         375744  [Ljava.util.HashMap$Entry;
Rebecca-Watsons-iMac:work iwatson$ ls ./mach-lcf/data/data-serv-lcf/artdoc1/index/*.tii
-rw-r--r--  1 iwatson  staff    26M 18 Aug 23:44 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_36.tii
-rw-r--r--  1 iwatson  staff    26M 19 Aug 00:06 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_69.tii
-rw-r--r--  1 iwatson  staff    25M 19 Aug 00:26 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_9d.tii
-rw-r--r--  1 iwatson  staff    24M 19 Aug 00:50 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_ch.tii
-rw-r--r--  1 iwatson  staff    25M 19 Aug 01:11 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_fj.tii
-rw-r--r--  1 iwatson  staff   3.1M 19 Aug 01:12 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_fq.tii
-rw-r--r--  1 iwatson  staff   3.1M 19 Aug 01:12 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_g1.tii
-rw-r--r--  1 iwatson  staff   167B 19 Aug 01:10 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_gb.tii
-rw-r--r--  1 iwatson  staff   3.1M 19 Aug 01:11 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_gc.tii
-rw-r--r--  1 iwatson  staff   223K 19 Aug 01:23 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_gd.tii
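
The histogram figures above can be turned into a rough per-object cost by dividing bytes by instances. This is just a sketch of that arithmetic; the class name TermIndexFootprint is made up, and it assumes (as the instance counts suggest, but the post does not confirm) that nearly all of the String and char[] instances belong to the loaded Term objects. The numbers themselves are copied from the jmap output as posted, repeated rows included.

```java
// Hypothetical helper: works out average object sizes from the
// jmap -histo:live figures posted above.
class TermIndexFootprint {

    /** Average size in bytes of one instance of a class. */
    static double bytesPerInstance(long totalBytes, long instances) {
        return (double) totalBytes / instances;
    }

    public static void main(String[] args) {
        // One row each, as printed by jmap (bytes, instances).
        double termInfo = bytesPerInstance(244197160L, 6104929L);
        double term     = bytesPerInstance(195357472L, 6104921L);
        double string   = bytesPerInstance(733299640L, 18332491L);
        double chars    = bytesPerInstance(1422732624L, 18334714L);

        System.out.printf("TermInfo: %.1f bytes%n", termInfo);
        System.out.printf("Term:     %.1f bytes%n", term);
        System.out.printf("String:   %.1f bytes%n", string);
        System.out.printf("char[]:   %.1f bytes%n", chars);

        // Assumption: each indexed term holds one TermInfo, one Term,
        // one String and one char[]; array slots and offsets are ignored.
        double perIndexedTerm = termInfo + term + string + chars;
        System.out.printf("approx. per indexed term: %.1f bytes%n", perIndexedTerm);
    }
}
```

The TermInfo, Term and String rows divide out to exactly 40, 32 and 40 bytes per instance, so the character data is not the only cost: the object overhead alone is over 100 bytes per indexed term before a single character of the term text is counted.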

Re: tii RAM usage on startup

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
  > I'm not sure how Solr exposes this configuration though.

this one?

<!-- To set the setTermIndexInterval, do this: -->
<!--<indexReaderFactory name="IndexReaderFactory" 
class="org.apache.solr.core.StandardIndexReaderFactory">
<int name="setTermIndexInterval">12</int>
</indexReaderFactory >-->

Koji

-- 
http://www.rondhuit.com/en/



(10/08/19 3:36), Michael McCandless wrote:
> I'm not sure why you see 1.5 GB before restart but then 4 GB after.
>
> But seeing a 26 MB tii file -->  200 MB RAM is unfortunately expected;
> in 3.x Lucene's in-RAM representation of the terms index is very
> inefficient (three separate object instances (TermInfo, Term, String)
> per indexed term, with each object having various fields, etc.).
>
> This has been improved substantially in trunk with flexible indexing.
>
> You can increase the terms index divisor when you open your
> IndexReader.  EG, passing 2 (instead of the default 1) keeps every
> other indexed term, halving the required RAM (but taking more time to
> seek to a certain term).  I'm not sure how Solr exposes this
> configuration though.
>
> Mike



Re: tii RAM usage on startup

Posted by Michael McCandless <lu...@mikemccandless.com>.
I'm not sure why you see 1.5 GB before restart but then 4 GB after.

But seeing a 26 MB tii file --> 200 MB RAM is unfortunately expected;
in 3.x Lucene's in-RAM representation of the terms index is very
inefficient (three separate object instances (TermInfo, Term, String)
per indexed term, with each object having various fields, etc.).

This has been improved substantially in trunk with flexible indexing.

You can increase the terms index divisor when you open your
IndexReader.  EG, passing 2 (instead of the default 1) keeps every
other indexed term, halving the required RAM (but taking more time to
seek to a certain term).  I'm not sure how Solr exposes this
configuration though.

Mike
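
The trade-off described above (fewer index entries in RAM, more scanning per seek) can be illustrated with a small stand-alone model. This is only a sketch and not Lucene's implementation: the class SampledTermIndex and its methods are made up, a sorted String array stands in for the on-disk term dictionary, and the interval of 128 is assumed to match Lucene's default term index interval.

```java
// Hypothetical model of a sampled terms index; not Lucene code.
class SampledTermIndex {
    private final String[] allTerms;       // stands in for the sorted on-disk dictionary
    private final int[] samplePositions;   // positions of the entries kept "in RAM"
    private int lastScanLength;            // terms read sequentially by the last seek

    SampledTermIndex(String[] sortedTerms, int indexInterval, int divisor) {
        this.allTerms = sortedTerms;
        int step = indexInterval * divisor; // divisor 2 keeps every other index entry
        int count = (sortedTerms.length + step - 1) / step;
        this.samplePositions = new int[count];
        for (int i = 0; i < count; i++) {
            samplePositions[i] = i * step;
        }
    }

    int inMemoryEntries() { return samplePositions.length; }

    int lastScanLength() { return lastScanLength; }

    /** Returns the position of the term in the dictionary, or -1 if absent. */
    int seek(String term) {
        lastScanLength = 0;
        // Binary search for the last sampled entry that is <= term.
        int lo = 0, hi = samplePositions.length - 1, start = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (allTerms[samplePositions[mid]].compareTo(term) <= 0) {
                start = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (start < 0) {
            return -1; // term sorts before every sampled entry
        }
        // Sequential scan from the sampled entry, like reading the dictionary file.
        for (int pos = samplePositions[start]; pos < allTerms.length; pos++) {
            lastScanLength++;
            int cmp = allTerms[pos].compareTo(term);
            if (cmp == 0) return pos;
            if (cmp > 0) break;
        }
        return -1;
    }

    /** Builds n zero-padded, already sorted demo terms. */
    static String[] demoTerms(int n) {
        String[] terms = new String[n];
        for (int i = 0; i < n; i++) {
            terms[i] = String.format("t%06d", i);
        }
        return terms;
    }

    public static void main(String[] args) {
        String[] terms = demoTerms(100000);
        for (int divisor : new int[] {1, 2, 4}) {
            SampledTermIndex index = new SampledTermIndex(terms, 128, divisor);
            int pos = index.seek("t099999");
            System.out.printf("divisor=%d entries=%d found=%d scanned=%d%n",
                    divisor, index.inMemoryEntries(), pos, index.lastScanLength());
        }
    }
}
```

In this model, doubling the divisor halves the number of entries held in memory and roughly doubles the worst-case sequential scan, which is the same shape of trade-off as the terms index divisor on a real IndexReader.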