Posted to solr-user@lucene.apache.org by Rebecca Watson <be...@gmail.com> on 2010/08/18 19:54:38 UTC
tii RAM usage on startup
hi,
I am running solr 1.4.1 and java 1.6 with 6GB heap and the following
GC settings:
gc_args="-XX:+UseConcMarkSweepGC
-XX:+CMSClassUnloadingEnabled -XX:NewSize=2g -XX:MaxNewSize=2g
-XX:CMSInitiatingOccupancyFraction=60"
So 6GB total heap, with 2GB allocated to the young generation (eden
plus survivor spaces).
I have caching, autocommit and auto-warming commented out of
solrconfig.xml
After I index 500k docs and call commit/optimize (via URL after indexing
has completed) my RAM usage is only about 1.5GB, but if I then stop
and restart my Solr server over the same data the RAM immediately
jumps to about 4GB, and I can't understand why there is a difference.
As this is close to the old gen limit, I quickly find that Solr
becomes unresponsive.
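For reference, the heap arithmetic behind "close to the old gen limit" can be sketched as below. This is only a rough illustration: it assumes the 6GB figure is the maximum heap, that the 2GB from -XX:NewSize/-XX:MaxNewSize is the whole young generation, and that CMSInitiatingOccupancyFraction acts as a simple percentage threshold on the old generation. The function names are made up for this example.

```python
GIB = 1024 ** 3

def old_gen_bytes(heap_bytes, young_bytes):
    """Old generation is what remains of the heap after the young generation."""
    return heap_bytes - young_bytes

def cms_trigger_bytes(old_bytes, occupancy_percent):
    """Old-gen occupancy at which a CMS cycle would start (simplified model)."""
    return old_bytes * occupancy_percent // 100

heap = 6 * GIB    # total heap from the post
young = 2 * GIB   # -XX:NewSize=2g -XX:MaxNewSize=2g
old = old_gen_bytes(heap, young)
trigger = cms_trigger_bytes(old, 60)  # -XX:CMSInitiatingOccupancyFraction=60

print(f"old gen: {old / GIB:.1f} GiB")
print(f"CMS trigger: {trigger / GIB:.1f} GiB")
```

Under these assumptions the old generation is 4 GiB and CMS would start collecting at roughly 2.4 GiB, so a live set near 4GB leaves almost no headroom.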
The following shows that the .tii files, about 26MB each on disk,
consume over 200MB each in RAM when I restart the server.
Is this expected?
thanks for any help/advice in advance,
bec :)
-----------------
Rebecca-Watsons-iMac:work iwatson$ jmap -histo:live 8992 | head -30
num #instances #bytes class name
----------------------------------------------
1: 18334714 1422732624 [C
2: 18332491 733299640 java.lang.String
3: 6104929 244197160 org.apache.lucene.index.TermInfo
4: 6104929 244197160 org.apache.lucene.index.TermInfo
5: 6104929 244197160 org.apache.lucene.index.TermInfo
6: 6104921 195357472 org.apache.lucene.index.Term
7: 6104921 195357472 org.apache.lucene.index.Term
8: 6104921 195357472 org.apache.lucene.index.Term
9: 224 146527408 [J
10: 10 48839592 [Lorg.apache.lucene.index.TermInfo;
11: 10 48839592 [Lorg.apache.lucene.index.Term;
12: 10 48839592 [Lorg.apache.lucene.index.TermInfo;
13: 10 48839592 [Lorg.apache.lucene.index.TermInfo;
14: 10 48839592 [Lorg.apache.lucene.index.Term;
15: 10 48839592 [Lorg.apache.lucene.index.Term;
16: 41630 6264728 <constMethodKlass>
17: 41630 5005104 <methodKlass>
18: 4049 4596352 <constantPoolKlass>
19: 4049 3049984 <instanceKlassKlass>
20: 3129 2580040 <constantPoolCacheKlass>
21: 49713 2418496 <symbolKlass>
22: 4983 1067192 [B
23: 4381 806104 java.lang.Class
24: 5979 533064 [[I
25: 6124 438080 [S
26: 7951 381648 java.util.HashMap$Entry
27: 2071 375744 [Ljava.util.HashMap$Entry;
Rebecca-Watsons-iMac:work iwatson$ ls ./mach-lcf/data/data-serv-lcf/artdoc1/index/*.tii
-rw-r--r-- 1 iwatson staff 26M 18 Aug 23:44 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_36.tii
-rw-r--r-- 1 iwatson staff 26M 19 Aug 00:06 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_69.tii
-rw-r--r-- 1 iwatson staff 25M 19 Aug 00:26 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_9d.tii
-rw-r--r-- 1 iwatson staff 24M 19 Aug 00:50 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_ch.tii
-rw-r--r-- 1 iwatson staff 25M 19 Aug 01:11 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_fj.tii
-rw-r--r-- 1 iwatson staff 3.1M 19 Aug 01:12 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_fq.tii
-rw-r--r-- 1 iwatson staff 3.1M 19 Aug 01:12 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_g1.tii
-rw-r--r-- 1 iwatson staff 167B 19 Aug 01:10 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_gb.tii
-rw-r--r-- 1 iwatson staff 3.1M 19 Aug 01:11 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_gc.tii
-rw-r--r-- 1 iwatson staff 223K 19 Aug 01:23 ./mach-lcf/data/data-serv-lcf/artdoc1/index/_gd.tii
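As a quick sanity check on the histogram, the per-instance cost of each class can be worked out by dividing the byte column by the instance column. The sketch below does only that; the numbers are copied from the jmap output in this post, and the helper name is invented for the example.

```python
# (instances, bytes) pairs copied from the jmap -histo:live output in the post
histogram = {
    "[C": (18334714, 1422732624),
    "java.lang.String": (18332491, 733299640),
    "org.apache.lucene.index.TermInfo": (6104929, 244197160),
    "org.apache.lucene.index.Term": (6104921, 195357472),
}

def bytes_per_instance(instances, total_bytes):
    """Average heap bytes used by one instance of a class."""
    return total_bytes / instances

for name, (instances, total_bytes) in histogram.items():
    size = bytes_per_instance(instances, total_bytes)
    print(f"{name}: {size:.1f} bytes/instance")
```

This works out to 40 bytes per String, 40 bytes per TermInfo and 32 bytes per Term, before counting the character data itself. Note that the [C row covers every char array in the heap, so its average is only a rough guide to the size of the term text.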
Re: tii RAM usage on startup
Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
> I'm not sure how Solr exposes this configuration though.
this one?
<!-- To set the setTermIndexInterval, do this: -->
<!--<indexReaderFactory name="IndexReaderFactory"
                        class="org.apache.solr.core.StandardIndexReaderFactory">
  <int name="setTermIndexInterval">12</int>
</indexReaderFactory >-->
Koji
--
http://www.rondhuit.com/en/
(10/08/19 3:36), Michael McCandless wrote:
> I'm not sure why you see 1.5 GB before restart but then 4 GB after.
>
> But seeing a 26 MB tii file --> 200 MB RAM is unfortunately expected;
> in 3.x Lucene's in-RAM representation of the terms index is very
> inefficient (three separate object instances (TermInfo, Term, String)
> per indexed term, with each object having various fields, etc.).
>
> This has been improved substantially in trunk with flexible indexing.
>
> You can increase the terms index divisor when you open your
> IndexReader. EG, passing 2 (instead of the default 1) keeps every
> other indexed term, halving the required RAM (but taking more time to
> seek to a certain term). I'm not sure how Solr exposes this
> configuration though.
>
> Mike
>
Re: tii RAM usage on startup
Posted by Michael McCandless <lu...@mikemccandless.com>.
I'm not sure why you see 1.5 GB before restart but then 4 GB after.
But seeing a 26 MB tii file --> 200 MB RAM is unfortunately expected;
in 3.x Lucene's in-RAM representation of the terms index is very
inefficient (three separate object instances (TermInfo, Term, String)
per indexed term, with each object having various fields, etc.).
This has been improved substantially in trunk with flexible indexing.
You can increase the terms index divisor when you open your
IndexReader. For example, passing 2 (instead of the default 1) keeps
every other indexed term, halving the required RAM (at the cost of
more time to seek to a given term). I'm not sure how Solr exposes
this configuration though.
Mike
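The trade-off of a terms index divisor can be illustrated with a toy model. This is not Lucene's code, only a sketch of the idea described above: the in-RAM index keeps every N-th indexed term, and a lookup starts from the nearest kept term at or before the target and scans forward from there. The term list and function names are hypothetical.

```python
import bisect

def subsample(index_terms, divisor):
    """Keep every `divisor`-th entry of a sorted terms index."""
    return index_terms[::divisor]

def seek_start(kept_terms, target):
    """Return the last kept term <= target, where a sequential scan would begin."""
    pos = bisect.bisect_right(kept_terms, target) - 1
    return kept_terms[max(pos, 0)]

# Hypothetical sorted list standing in for the indexed terms of one segment
index_terms = ["apple", "banana", "cherry", "date",
               "elder", "fig", "grape", "honey"]

full = subsample(index_terms, 1)  # default divisor: every indexed term in RAM
half = subsample(index_terms, 2)  # divisor 2: every other indexed term in RAM

print(len(full), len(half))
print(seek_start(full, "date"), seek_start(half, "date"))
```

With divisor 2 the in-memory list is half as long, but a seek to "date" has to start from "cherry" and scan further, which is the extra seek time mentioned above.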