Posted to solr-user@lucene.apache.org by Rahul R <ra...@gmail.com> on 2012/04/30 11:23:56 UTC

Lucene FieldCache - Out of memory exception

Hello,
I am using solr 1.3 with jdk 1.5.0_14 and weblogic 10MP1 application server
on Solaris. I use embedded solr server. More details :
Number of docs in solr index : 1.4 million
Physical size of index : 640MB
Total number of fields in the index : 700 (99% of these are dynamic fields)
Total number of fields enabled for faceting : 440
Avg number of facet fields participating in a faceted query : 50-70
Total RAM allocated to weblogic appserver : 3GB (max possible)

In a multi-user environment, with 3 users using this application for
around 40 minutes, the application runs out of memory. Analysis of the heap
dump shows that almost 85% of the memory is retained by the FieldCache. I
understand that the field cache is out of our control, but I would
appreciate some suggestions on how to handle this issue.

Some questions on this front :
- some mail threads on this forum seem to indicate that there could be some
connection between having dynamic fields and usage of FieldCache. Is this
true ? Most of the fields in my index are dynamic fields.
- as mentioned above, most of my faceted queries could have around 50-70
facet fields (I would do SolrQuery.addFacetField() for around 50-70 fields
per query). Could this be the source of the problem ? Is this too high for
solr to support ?
- Initially, I had a facet.sort defined in solrconfig.xml. Since FieldCache
builds up on sorting, I removed the facet.sort and tried again, but to no
avail. The behavior is the same as before.
- The document id that I have for each document is quite big (around 50
characters on average). Can this be a problem ? I reduced this to around 15
characters and tried but still there is no improvement.
- Can the size of the data be a problem ? But on this forum, I see many
users talking of more than 100 million documents in their index. I have
only 1.4 million with physical size of 640MB. The physical server on which
this application is running, has sufficient RAM and CPU.
- What gets stored in the FieldCache ? Is it the entire document or just
the document Id ?


Any help is much appreciated. Thank you.

regards
Rahul

Re: Lucene FieldCache - Out of memory exception

Posted by Chris Hostetter <ho...@fucit.org>.
: I am using solr 1.3 with jdk 1.5.0_14 and weblogic 10MP1 application server
: on Solaris. I use embedded solr server. More details :

FWIW: Solr 1.3 is *REALLY* old ... do not be surprised if much of the info 
you are given (or read) doesn't apply.

: - some mail threads on this forum seem to indicate that there could be some
: connection between having dynamic fields and usage of FieldCache. Is this
: true ? Most of the fields in my index are dynamic fields.

there is no specific correlation between dynamic fields and the field 
cache -- what you may be seeing is people commenting about the dangers of 
*using* field caches with dynamic fields, because typically when people 
use dynamic fields there is no fixed number of pre-defined fields in use 
(that's the whole perk of dynamic fields) so if you are using hundreds or 
thousands of dynamic fields in a way that involves the field cache, you 
might have problems (because field cache objects tend to be large)

: - as mentioned above, most of my faceted queries could have around 50-70
: facet fields (I would do SolrQuery.addFacetField() for around 50-70 fields
: per query). Could this be the source of the problem ? Is this too high for
: solr to support ?

In Solr 1.3, faceting does not use the field cache AT ALL!

starting with Solr 1.4, faceting can use the field cache (or a similar 
concept called "UnInvertedFields" when multivalued).  You can force 
Solr 1.4+ not to use the field cache for this by specifying 
facet.method=enum

https://wiki.apache.org/solr/SimpleFacetParameters#facet.method
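
As a minimal sketch of the above, the snippet below builds a faceted request
that carries facet.method=enum. The helper name build_facet_query is made up
for illustration and is not part of any Solr client; the field names are two
of the facet fields from the sample query posted in this thread. The
parameter is only expected to have an effect on Solr 1.4 or later.

```python
from urllib.parse import urlencode, parse_qs

def build_facet_query(facet_fields, method="enum"):
    """Build a Solr query string that facets on the given fields.

    build_facet_query is a hypothetical helper, not a Solr API.
    facet.method applies to Solr 1.4+; Solr 1.3 predates the parameter.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.mincount", "1"),
        ("facet.method", method),
    ]
    # One facet.field parameter per field, as in the logged query.
    params.extend(("facet.field", name) for name in facet_fields)
    return urlencode(params)

query = build_facet_query(["S_P1406389942", "F_P1946367021"])
print(query)
# Decode it again to confirm which faceting method the request asks for.
print(parse_qs(query)["facet.method"])
```

With SolrJ the same parameter can presumably be set on the query object
before it is sent; the string form is shown here only because it matches
the logged queries in this thread.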

: - Initially, I had a facet.sort defined in solrconfig.xml. Since FieldCache
: builds up on sorting, I even removed the facet.sort and tried, but no
: respite. The behavior is same as before.

Facet sorting is not the same as result sorting. facet sorting does not 
use the field cache at all.

nothing you've mentioned in your initial email, or the example query you 
posted should involve the field cache in any way (in Solr 1.3!) so if you 
are seeing your heap eaten up by field cache objects there is more going 
on in your system than you know about (or than you've told us) ... you 
need to look at the fields associated with those field caches, and then 
see how you are using those fields in requests, to make sense of why they 
exist in your heap.
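
One way to start that audit is to decode a logged request and list which
parameters it uses and which fields it touches. The sketch below is only an
illustration: LOGGED_QUERY is a shortened excerpt of the query posted in this
thread, summarize_request is a made-up helper, and the assumption that the
prefix before the underscore encodes the field type comes from the schema
described by the original poster.

```python
from collections import Counter
from urllib.parse import parse_qs

# Shortened excerpt of the logged query from this thread; only four of the
# facet.field parameters are kept so the example stays small.
LOGGED_QUERY = (
    "q=*%3A*&rows=0&facet=true&facet.mincount=1&facet.limit=2"
    "&facet.field=S_C1503120369&facet.field=S_P1406389942"
    "&facet.field=F_P1946367021&facet.field=I_P1406452433"
)

def summarize_request(query_string):
    """Return (sorted parameter names, facet-field count per name prefix)."""
    params = parse_qs(query_string)
    facet_fields = params.get("facet.field", [])
    # Assumption: the prefix before the first underscore marks the dynamic
    # field type in this schema (S_ string, F_ float, I_ int).
    by_prefix = Counter(name.split("_", 1)[0] for name in facet_fields)
    return sorted(params), dict(by_prefix)

names, by_prefix = summarize_request(LOGGED_QUERY)
print(names)            # every parameter name present in the request
print(by_prefix)        # how many facet fields of each type were requested
print("sort" in names)  # whether the request asked for result sorting
```

Running this over every distinct request in the log, rather than a single
excerpt, would show whether anything other than faceting (for example a sort
parameter) refers to the fields that show up in the heap dump.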



-Hoss

Re: Lucene FieldCache - Out of memory exception

Posted by Rahul R <ra...@gmail.com>.
An update on the things I tried today. Since multiValued fields do not use
the fieldCache, I changed my schema to define all my fields as multiValued
fields. Although these fields only need to be single valued, I made this
change, recreated the index and tested with it. Observations:
- a forced GC always frees up most of the heap, i.e. the FieldCache
doesn't seem to be created. So the OOM issue does not occur.
- response time is terribly slow for faceting queries. The application is
almost unusable and system monitoring shows high CPU usage.
- using the Solr caches (documentCache, filterCache and queryResultCache)
does not seem to improve performance. Cache sizes are documentCache - 100K,
filterCache - 10K, queryResultCache - 10K.

I don't think I can use this as a solution because response times are very
poor. But a few questions :
- The Solr documentation indicates that the fieldCache gets built up on
sorting and function queries only. When I use single-valued fields, I don't
do any explicit sorting or use any functions. Could there be some setting
that causes the result set to be sorted automatically (although I don't
want a sort)?
- is there a way I can improve faceting performance with all my fields as
multiValued fields ?

Appreciate any help on this. Thank you.

- Rahul

Re: Lucene FieldCache - Out of memory exception

Posted by Rahul R <ra...@gmail.com>.
Jack,
Sorry for the delayed response:
Total memory allocated : 3GB
Free Memory on startup of application server : 2.85GB (95%)
Free Memory after first request by first user(1 request involves 3 queries)
: 2.7GB (90%)
Free Memory after a few requests by same user : 2.52GB (84%)

All values above were recorded after two forced GCs, so that they reflect
the memory that is actually free.

Memory usage climbs quite quickly with the above numbers. As the number of
searches grows, the rate of memory consumption decreases. But at some point
it does hit OOM.
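
A slowing growth rate like this is what one would expect from a cache that
is filled once per field and never evicted. The toy simulation below is only
a sketch of that idea: it assumes each query picks 60 of the 440 facet
fields uniformly at random, which real traffic will not do, and it counts
cached fields rather than bytes.

```python
import random

TOTAL_FACET_FIELDS = 440   # fields enabled for faceting (from the first post)
FIELDS_PER_QUERY = 60      # middle of the reported 50-70 range

def simulate_cache_growth(num_queries, seed=0):
    """Return the number of cached fields after each query.

    Assumptions: an entry is built the first time a field is used and is
    never evicted; each query picks its facet fields uniformly at random.
    """
    rng = random.Random(seed)
    all_fields = range(TOTAL_FACET_FIELDS)
    cached = set()
    sizes = []
    for _ in range(num_queries):
        # Only fields not seen before add to the cache.
        cached.update(rng.sample(all_fields, FIELDS_PER_QUERY))
        sizes.append(len(cached))
    return sizes

sizes = simulate_cache_growth(30)
for query_number, cached_fields in enumerate(sizes, start=1):
    print(query_number, cached_fields)
```

The count can never shrink and can never exceed 440, so under these
assumptions the heap keeps filling, more slowly over time, until every
faceted field has been touched.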

- Rahul

Re: Lucene FieldCache - Out of memory exception

Posted by Jack Krupansky <ja...@basetechnology.com>.
Just for a baseline, how much memory is available in the JVM (using jconsole 
or something similar) before you do your first query, and then after your 
first query (that has these 50-70 facets), and then after a few different 
queries (different facets)? Just to see how close you are to "the edge" even 
before a volume of queries starts coming in.

-- Jack Krupansky

-----Original Message----- 
From: Rahul R
Sent: Thursday, May 03, 2012 1:28 AM
To: solr-user@lucene.apache.org
Subject: Re: Lucene FieldCache - Out of memory exception

Jack,
Yes, the queries work fine till I hit the OOM. The fields that start with
S_* are strings, F_* are floats, I_* are ints and so on. The dynamic field
definitions from schema.xml:
<dynamicField name="S_*" type="string"  indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="I_*" type="sint"    indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="F_*" type="sfloat"  indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="D_*" type="date"    indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="B_*" type="boolean" indexed="true" stored="true" omitNorms="true"/>
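
To make the prefix convention concrete, here is a small sketch that maps a
field name to its type using the patterns from the dynamicField definitions
above. resolve_type is a made-up helper for illustration; it is a
simplification and is not how Solr itself resolves field names (explicitly
declared fields, for example, are not modelled here).

```python
from fnmatch import fnmatchcase

# Patterns and types copied from the dynamicField definitions above.
DYNAMIC_FIELDS = [
    ("S_*", "string"),
    ("I_*", "sint"),
    ("F_*", "sfloat"),
    ("D_*", "date"),
    ("B_*", "boolean"),
]

def resolve_type(field_name):
    """Return the type of the first dynamic pattern matching field_name.

    Returns None when no pattern matches. Simplified sketch: explicitly
    declared schema fields are not considered.
    """
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(field_name, pattern):
            return field_type
    return None

for name in ("S_P1406389942", "F_P1946367021", "I_P1406452433", "Category"):
    print(name, "->", resolve_type(name))
```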

*Each FieldCache will be an array with maxdoc entries (your total number of
documents - 1.4 million) times the size of the field value or whatever a
string reference is in your JVM*
So if I understand correctly, every field (dynamic or normal) will have its
own field cache. Will the size of the field cache for any field be (maxDocs
* sizeOfField)? If the field has only 100 unique values, will it occupy
(100 * sizeOfField) or will it still be (maxDocs * sizeOfField)?
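
A back-of-the-envelope sketch of the sizing rule quoted above may help here.
It assumes that every faceted field really does get a FieldCache entry as
described (which is disputed elsewhere in this thread for Solr 1.3), that
each per-document entry takes 4 bytes, and that a short Java String costs
about 2 bytes per character plus roughly 40 bytes of object overhead. The
overhead figure in particular is a guess that depends on the JVM.

```python
MAX_DOC = 1_400_000     # documents in the index (from the first post)
BYTES_PER_ENTRY = 4     # assumption: one 4-byte int/float/ordinal per doc

def per_doc_array_bytes(max_doc=MAX_DOC, bytes_per_entry=BYTES_PER_ENTRY):
    """Size of the per-document array; it does not depend on how many
    unique values the field has."""
    return max_doc * bytes_per_entry

def unique_terms_bytes(unique_terms, avg_chars, bytes_per_char=2, overhead=40):
    """Rough size of the extra unique-term table kept for a string field.

    Assumption: 2 bytes per char plus about 40 bytes per String object.
    """
    return unique_terms * (avg_chars * bytes_per_char + overhead)

def mib(n_bytes):
    """Convert bytes to mebibytes."""
    return n_bytes / (1024 * 1024)

one_field = per_doc_array_bytes()
terms = unique_terms_bytes(unique_terms=300, avg_chars=20)
print(f"per-doc array, one field : {mib(one_field):.1f} MiB")
print(f"unique-term table        : {mib(terms):.3f} MiB")
print(f"arrays for 60 fields     : {mib(60 * one_field):.0f} MiB")
print(f"arrays for 440 fields    : {mib(440 * one_field):.0f} MiB")
```

Under these assumptions the per-document array dominates: it is sized by
maxDoc, not by the number of unique values, so a field with 100 unique
values costs about the same as one with 1000. The arrays for all 440 faceted
fields would come to roughly 2.3 GiB, which is most of a 3 GB heap.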

*Roughly what is the typical or average length of one of your facet field
values? And, on average, how many unique terms are there within a typical
faceted field?*
Each field length may vary from 10 - 30 characters. Average of 20 maybe.
Number of unique terms within a faceted field will vary from 100 - 1000.
Average of 300. How will the number of unique terms affect performance ?

*3 GB sounds like it might not be enough for such heavy use of faceting. It
is probably not the 50-70 number, but the 440 or accumulated number across
many queries that pushes the memory usage up*
I am using jdk1.5.0_14 - 32 bit. With 32 bit jdk, I think there is a
limitation that more RAM cannot be allocated.

*When you hit OOM, what does the Solr admin stats display say for
FieldCache?*
I don't have solr deployed as a separate web app. All solr jar files are
present in my webapp's WEB-INF\lib directory. I use EmbeddedSolrServer. So
is there a way I can get this information that the admin would show ?

Thank you for your time.

-Rahul


On Wed, May 2, 2012 at 5:19 PM, Jack Krupansky <ja...@basetechnology.com> wrote:

> The FieldCache gets populated the first time a given field is referenced
> as a facet and then will stay around forever. So, as additional queries
> get executed with different facet fields, the number of FieldCache
> entries will grow.
>
> If I understand what you have said, these faceted queries do work
> initially, but after a while they stop working with OOM, correct?
>
> The size of a single FieldCache depends on the field type. Since you are
> using dynamic fields, it depends on your "dynamicField" types - which you
> have not told us about. From your query I see that your fields start with
> "S_" and "F_" - presumably you have dynamic field types "S_*" and "F_*"?
> Are they strings, integers, floats, or what?
>
> Each FieldCache will be an array with maxdoc entries (your total number of
> documents - 1.4 million) times the size of the field value or whatever a
> string reference is in your JVM.
>
> String fields will take more space than numeric fields for the FieldCache,
> since a separate table is maintained for the unique terms in that field.
> Roughly what is the typical or average length of one of your facet field
> values? And, on average, how many unique terms are there within a typical
> faceted field?
>
> If you can convert many of these faceted fields to simple integers the
> size should go down dramatically, but that depends on your application.
>
> 3 GB sounds like it might not be enough for such heavy use of faceting. It
> is probably not the 50-70 number, but the 440 or accumulated number across
> many queries that pushes the memory usage up.
>
> When you hit OOM, what does the Solr admin stats display say for
> FieldCache?
>
> -- Jack Krupansky
>
> -----Original Message----- From: Rahul R
> Sent: Wednesday, May 02, 2012 2:22 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache - Out of memory exception
>
>
> Here is one sample query that I picked up from the log file :
>
> q=*%3A*&fq=Category%3A%223__107%22
> &fq=S_P1540477699%3A%22MICROCIRCUIT%2C+LINE+TRANSCEIVERS%22
> &rows=0&facet=true&facet.mincount=1&facet.limit=2
> &facet.field=S_C1503120369&facet.field=S_P1406389942
> &facet.field=S_P1430116878&facet.field=S_P1430116881
> &facet.field=S_P1406453552&facet.field=S_P1406451296
> &facet.field=S_P1406452465&facet.field=S_C2968809156
> &facet.field=S_P1406389980&facet.field=S_P1540477699
> &facet.field=S_P1406389982&facet.field=S_P1406389984
> &facet.field=S_P1406451284&facet.field=S_P1406389926
> &facet.field=S_P1424886581&facet.field=S_P2017662632
> &facet.field=F_P1946367021&facet.field=S_P1430116884
> &facet.field=S_P2017662620&facet.field=F_P1406451304
> &facet.field=F_P1406451306&facet.field=F_P1406451308
> &facet.field=S_P1500901421&facet.field=S_P1507138990
> &facet.field=I_P1406452433&facet.field=I_P1406453565
> &facet.field=I_P1406452463&facet.field=I_P1406453573
> &facet.field=I_P1406451324&facet.field=I_P1406451288
> &facet.field=S_P1406451282&facet.field=S_P1406452471
> &facet.field=S_P1424886605&facet.field=S_P1946367015
> &facet.field=S_P1424886598&facet.field=S_P1946367018
> &facet.field=S_P1406453556&facet.field=S_P1406389932
> &facet.field=S_P2017662623&facet.field=S_P1406450978
> &facet.field=F_P1406452455&facet.field=S_P1406389972
> &facet.field=S_P1406389974&facet.field=S_P1406389986
> &facet.field=F_P1946367027&facet.field=F_P1406451294
> &facet.field=F_P1406451286&facet.field=F_P1406451328
> &facet.field=S_P1424886593&facet.field=S_P1406453567
> &facet.field=S_P2017662629&facet.field=S_P1406453571
> &facet.field=F_P1946367030&facet.field=S_P1406453569
> &facet.field=S_P2017662626&facet.field=S_P1406389978
> &facet.field=F_P1946367024
>
> My primary question here is, can Solr handle this kind of query with so
> many facet fields? I have tried using both enum and fc for facet.method and
> there is no improvement with either.
>
> Appreciate any help on this. Thank you.
>
> - Rahul
>
>
> On Mon, Apr 30, 2012 at 2:53 PM, Rahul R <ra...@gmail.com> wrote:
>
>  Hello,
>> I am using solr 1.3 with jdk 1.5.0_14 and weblogic 10MP1 application
>> server on Solaris. I use embedded solr server. More details :
>> Number of docs in solr index : 1.4 million
>> Physical size of index : 640MB
>> Total number of fields in the index : 700 (99% of these are dynamic
>> fields)
>> Total number of fields enabled for faceting : 440
>> Avg number of facet fields participating in a faceted query : 50-70
>> Total RAM allocated to weblogic appserver : 3GB (max possible)
>>
>> In a multi user environment with 3 users using this application for a
>> period of around 40 minutes, the application runs out of memory. Analysis
>> of the heap dump shows that almost 85% of the memory is retained by the
>> FieldCache. Now I understand that the field cache is out of our control but
>> would appreciate some suggestions on how to handle this issue.
>>
>> Some questions on this front :
>> - some mail threads on this forum seem to indicate that there could be
>> some connection between having dynamic fields and usage of FieldCache. Is
>> this true ? Most of the fields in my index are dynamic fields.
>> - as mentioned above, most of my faceted queries could have around 50-70
>> facet fields (I would do SolrQuery.addFacetField() for around 50-70 fields
>> per query). Could this be the source of the problem ? Is this too high for
>> solr to support ?
>> - Initially, I had a facet.sort defined in solrconfig.xml. Since
>> FieldCache builds up on sorting, I even removed the facet.sort and tried,
>> but no respite. The behavior is same as before.
>> - The document id that I have for each document is quite big (around 50
>> characters on average). Can this be a problem ? I reduced this to around 15
>> characters and tried but still there is no improvement.
>> - Can the size of the data be a problem ? But on this forum, I see many
>> users talking of more than 100 million documents in their index. I have
>> only 1.4 million with physical size of 640MB. The physical server on which
>> this application is running, has sufficient RAM and CPU.
>> - What gets stored in the FieldCache ? Is it the entire document or just
>> the document Id ?
>>
>>
>> Any help is much appreciated. Thank you.
>>
>> regards
>> Rahul
>>
>>
>>
>>
> 


Re: Lucene FieldCache - Out of memory exception

Posted by Rahul R <ra...@gmail.com>.
Jack,
Yes, the queries work fine till I hit the OOM. The fields that start with
S_* are strings, F_* are floats, I_* are ints and so on. The dynamic field
definitions from schema.xml :
   <dynamicField name="S_*" type="string" indexed="true" stored="true" omitNorms="true"/>
   <dynamicField name="I_*" type="sint" indexed="true" stored="true" omitNorms="true"/>
   <dynamicField name="F_*" type="sfloat" indexed="true" stored="true" omitNorms="true"/>
   <dynamicField name="D_*" type="date" indexed="true" stored="true" omitNorms="true"/>
   <dynamicField name="B_*" type="boolean" indexed="true" stored="true" omitNorms="true"/>

*Each FieldCache will be an array with maxdoc entries (your total number of
documents - 1.4 million) times the size of the field value or whatever a
string reference is in your JVM*
So if I understand correctly - every field (dynamic or normal) will have its
own field cache. The size of the field cache for any field will be (maxDocs
* sizeOfField) ? If the field has only 100 unique values, will it occupy
(100 * sizeOfField) or will it still be (maxDocs * sizeOfField) ?
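Going by Jack's description quoted above (one slot per document, plus a
separate table of unique terms for string fields), the per-document array
would not shrink when a field has only 100 unique values; only the term table
would. A rough Java sketch of that arithmetic follows. The byte sizes are
assumptions for a 32-bit JVM, not measured figures, and the class and method
names are made up for illustration.

```java
// Back-of-envelope estimate of the memory held by one FieldCache entry.
// Assumptions (not from the thread): 4 bytes per int/float/reference on a
// 32-bit JVM, 2 bytes per char, and about 40 bytes of fixed overhead per
// String object. Real numbers depend on the JVM and the Lucene version.
class FieldCacheEstimate {
    static final int BYTES_PER_SLOT = 4;
    static final int BYTES_PER_CHAR = 2;
    static final int STRING_OVERHEAD = 40;

    // Per-document array: one slot for every document in the index,
    // regardless of how many distinct values the field has.
    static long perDocArrayBytes(int maxDoc) {
        return (long) maxDoc * BYTES_PER_SLOT;
    }

    // Table of distinct terms, which only string fields need.
    static long termTableBytes(int uniqueTerms, int avgTermLength) {
        long perTerm = BYTES_PER_SLOT + STRING_OVERHEAD
                + (long) avgTermLength * BYTES_PER_CHAR;
        return uniqueTerms * perTerm;
    }

    static long numericFieldBytes(int maxDoc) {
        return perDocArrayBytes(maxDoc);
    }

    static long stringFieldBytes(int maxDoc, int uniqueTerms, int avgTermLength) {
        return perDocArrayBytes(maxDoc) + termTableBytes(uniqueTerms, avgTermLength);
    }

    public static void main(String[] args) {
        int maxDoc = 1400000; // document count given in the thread
        System.out.println("numeric field:     " + numericFieldBytes(maxDoc) + " bytes");
        System.out.println("string, 100 terms: " + stringFieldBytes(maxDoc, 100, 20) + " bytes");
        System.out.println("string, 300 terms: " + stringFieldBytes(maxDoc, 300, 20) + " bytes");
    }
}
```

Under these assumptions the term table is tiny next to the per-document
array, so going from 100 to 300 unique terms barely changes the per-field
total.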

*Roughly what is the typical or average length of one of your facet field
values? And, on average, how many unique terms are there within a typical
faceted field?*
Each field length may vary from 10 - 30 characters. Average of 20 maybe.
Number of unique terms within a faceted field will vary from 100 - 1000.
Average of 300. How will the number of unique terms affect performance ?

*3 GB sounds like it might not be enough for such heavy use of faceting. It
is probably not the 50-70 number, but the 440 or accumulated number across
many queries that pushes the memory usage up*
I am using jdk1.5.0_14 - 32 bit. With 32 bit jdk, I think there is a
limitation that more RAM cannot be allocated.

*When you hit OOM, what does the Solr admin stats display say for
FieldCache?*
I don't have solr deployed as a separate web app. All solr jar files are
present in my webapp's WEB-INF\lib directory. I use EmbeddedSolrServer. So
is there a way I can get this information that the admin would show ?
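One JDK-only option, which does not break the usage down per FieldCache
entry, is to log overall heap usage from inside the webapp through the
standard java.lang.management API (available since JDK 5). This is only a
minimal sketch and not a replacement for the Solr admin stats page; the class
name HeapSnapshot is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Reports current heap usage of the JVM this code runs in.
class HeapSnapshot {
    private static final long MB = 1024L * 1024L;

    static String describe() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        // getMax() returns -1 when the maximum is undefined.
        String max = heap.getMax() < 0 ? "undefined" : (heap.getMax() / MB) + "MB";
        return "heap used=" + (heap.getUsed() / MB) + "MB"
                + " committed=" + (heap.getCommitted() / MB) + "MB"
                + " max=" + max;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Calling describe() before and after a batch of faceted queries would show how
much the heap grows as new facet fields are touched.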

Thank you for your time.

-Rahul


On Wed, May 2, 2012 at 5:19 PM, Jack Krupansky <ja...@basetechnology.com> wrote:

> The FieldCache gets populated the first time a given field is referenced
> as a facet and then will stay around forever. So, as additional queries get
> executed with different facet fields, the number of FieldCache entries will
> grow.
>
> If I understand what you have said, these faceted queries do work
> initially, but after a while they stop working with OOM, correct?
>
> The size of a single FieldCache depends on the field type. Since you are
> using dynamic fields, it depends on your "dynamicField" types - which you
> have not told us about. From your query I see that your fields start with
> "S_" and "F_" - presumably you have dynamic field types "S_*" and "F_*"?
> Are they strings, integers, floats, or what?
>
> Each FieldCache will be an array with maxdoc entries (your total number of
> documents - 1.4 million) times the size of the field value or whatever a
> string reference is in your JVM.
>
> String fields will take more space than numeric fields for the FieldCache,
> since a separate table is maintained for the unique terms in that field.
> Roughly what is the typical or average length of one of your facet field
> values? And, on average, how many unique terms are there within a typical
> faceted field?
>
> If you can convert many of these faceted fields to simple integers the
> size should go down dramatically, but that depends on your application.
>
> 3 GB sounds like it might not be enough for such heavy use of faceting. It
> is probably not the 50-70 number, but the 440 or accumulated number across
> many queries that pushes the memory usage up.
>
> When you hit OOM, what does the Solr admin stats display say for
> FieldCache?
>
> -- Jack Krupansky
>
> -----Original Message----- From: Rahul R
> Sent: Wednesday, May 02, 2012 2:22 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache - Out of memory exception
>
>
> Here is one sample query that I picked up from the log file :
>
> q=*%3A*&fq=Category%3A%223__107%22&fq=S_P1540477699%3A%22MICROCIRCUIT%2C+LINE+TRANSCEIVERS%22&rows=0&facet=true&facet.mincount=1&facet.limit=2&facet.field=S_C1503120369&facet.field=S_P1406389942&facet.field=S_P1430116878&facet.field=S_P1430116881&facet.field=S_P1406453552&facet.field=S_P1406451296&facet.field=S_P1406452465&facet.field=S_C2968809156&facet.field=S_P1406389980&facet.field=S_P1540477699&facet.field=S_P1406389982&facet.field=S_P1406389984&facet.field=S_P1406451284&facet.field=S_P1406389926&facet.field=S_P1424886581&facet.field=S_P2017662632&facet.field=F_P1946367021&facet.field=S_P1430116884&facet.field=S_P2017662620&facet.field=F_P1406451304&facet.field=F_P1406451306&facet.field=F_P1406451308&facet.field=S_P1500901421&facet.field=S_P1507138990&facet.field=I_P1406452433&facet.field=I_P1406453565&facet.field=I_P1406452463&facet.field=I_P1406453573&facet.field=I_P1406451324&facet.field=I_P1406451288&facet.field=S_P1406451282&facet.field=S_P1406452471&facet.field=S_P1424886605&facet.field=S_P1946367015&facet.field=S_P1424886598&facet.field=S_P1946367018&facet.field=S_P1406453556&facet.field=S_P1406389932&facet.field=S_P2017662623&facet.field=S_P1406450978&facet.field=F_P1406452455&facet.field=S_P1406389972&facet.field=S_P1406389974&facet.field=S_P1406389986&facet.field=F_P1946367027&facet.field=F_P1406451294&facet.field=F_P1406451286&facet.field=F_P1406451328&facet.field=S_P1424886593&facet.field=S_P1406453567&facet.field=S_P2017662629&facet.field=S_P1406453571&facet.field=F_P1946367030&facet.field=S_P1406453569&facet.field=S_P2017662626&facet.field=S_P1406389978&facet.field=F_P1946367024
>
> My primary question here is, can Solr handle this kind of query with so
> many facet fields? I have tried using both enum and fc for facet.method and
> there is no improvement with either.
>
> Appreciate any help on this. Thank you.
>
> - Rahul
>
>
> On Mon, Apr 30, 2012 at 2:53 PM, Rahul R <ra...@gmail.com> wrote:
>
>  Hello,
>> I am using solr 1.3 with jdk 1.5.0_14 and weblogic 10MP1 application
>> server on Solaris. I use embedded solr server. More details :
>> Number of docs in solr index : 1.4 million
>> Physical size of index : 640MB
>> Total number of fields in the index : 700 (99% of these are dynamic
>> fields)
>> Total number of fields enabled for faceting : 440
>> Avg number of facet fields participating in a faceted query : 50-70
>> Total RAM allocated to weblogic appserver : 3GB (max possible)
>>
>> In a multi user environment with 3 users using this application for a
>> period of around 40 minutes, the application runs out of memory. Analysis
>> of the heap dump shows that almost 85% of the memory is retained by the
>> FieldCache. Now I understand that the field cache is out of our control but
>> would appreciate some suggestions on how to handle this issue.
>>
>> Some questions on this front :
>> - some mail threads on this forum seem to indicate that there could be
>> some connection between having dynamic fields and usage of FieldCache. Is
>> this true ? Most of the fields in my index are dynamic fields.
>> - as mentioned above, most of my faceted queries could have around 50-70
>> facet fields (I would do SolrQuery.addFacetField() for around 50-70 fields
>> per query). Could this be the source of the problem ? Is this too high for
>> solr to support ?
>> - Initially, I had a facet.sort defined in solrconfig.xml. Since
>> FieldCache builds up on sorting, I even removed the facet.sort and tried,
>> but no respite. The behavior is same as before.
>> - The document id that I have for each document is quite big (around 50
>> characters on average). Can this be a problem ? I reduced this to around 15
>> characters and tried but still there is no improvement.
>> - Can the size of the data be a problem ? But on this forum, I see many
>> users talking of more than 100 million documents in their index. I have
>> only 1.4 million with physical size of 640MB. The physical server on which
>> this application is running, has sufficient RAM and CPU.
>> - What gets stored in the FieldCache ? Is it the entire document or just
>> the document Id ?
>>
>>
>> Any help is much appreciated. Thank you.
>>
>> regards
>> Rahul
>>
>>
>>
>>
>

Re: Lucene FieldCache - Out of memory exception

Posted by Jack Krupansky <ja...@basetechnology.com>.
The FieldCache gets populated the first time a given field is referenced as 
a facet and then will stay around forever. So, as additional queries get 
executed with different facet fields, the number of FieldCache entries will 
grow.
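Since what matters here is how many different fields have ever been faceted
on, one way to gauge that is to collect the distinct facet.field values across
the logged queries. The following is a small sketch under the assumption that
the queries are logged as URL-style parameter strings like the sample further
down; the class name and the shortened sample queries are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Collects the distinct facet.field values seen across logged query strings.
class DistinctFacetFields {
    static final String PARAM = "facet.field=";

    static Set<String> collect(List<String> loggedQueries) {
        Set<String> fields = new TreeSet<String>();
        for (String query : loggedQueries) {
            for (String param : query.split("&")) {
                if (param.startsWith(PARAM)) {
                    fields.add(param.substring(PARAM.length()));
                }
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Shortened, made-up queries in the same shape as the logged one.
        List<String> log = Arrays.asList(
            "q=*%3A*&rows=0&facet=true&facet.field=S_P1406389942&facet.field=F_P1946367021",
            "q=*%3A*&rows=0&facet=true&facet.field=S_P1406389942&facet.field=I_P1406452433");
        Set<String> fields = collect(log);
        System.out.println(fields.size() + " distinct facet fields: " + fields);
    }
}
```

Run over a real log, the size of the resulting set approximates how many
FieldCache entries the faceting has populated so far.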

If I understand what you have said, these faceted queries do work 
initially, but after a while they stop working with OOM, correct?

The size of a single FieldCache depends on the field type. Since you are 
using dynamic fields, it depends on your "dynamicField" types - which you 
have not told us about. From your query I see that your fields start with 
"S_" and "F_" - presumably you have dynamic field types "S_*" and "F_*"? Are 
they strings, integers, floats, or what?

Each FieldCache will be an array with maxdoc entries (your total number of 
documents - 1.4 million) times the size of the field value or whatever a 
string reference is in your JVM.

String fields will take more space than numeric fields for the FieldCache, 
since a separate table is maintained for the unique terms in that field. 
Roughly what is the typical or average length of one of your facet field 
values? And, on average, how many unique terms are there within a typical 
faceted field?

If you can convert many of these faceted fields to simple integers the size 
should go down dramatically, but that depends on your application.

3 GB sounds like it might not be enough for such heavy use of faceting. It 
is probably not the 50-70 number, but the 440 or accumulated number across 
many queries that pushes the memory usage up.
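To put rough numbers on that, here is a sketch of the accumulated size of the
per-document arrays alone, assuming 4 bytes per document per cached field (an
assumption for a 32-bit JVM, ignoring the term tables for string fields and
any other heap use). The class name is hypothetical.

```java
// Rough total for the per-document arrays of all cached facet fields.
// Assumption (not measured): each cached field costs bytesPerDoc bytes
// for every document in the index.
class FacetHeapBudget {
    static long cachedBytes(int fields, int maxDoc, int bytesPerDoc) {
        return (long) fields * maxDoc * bytesPerDoc;
    }

    public static void main(String[] args) {
        int maxDoc = 1400000;                   // documents in the index
        long heap = 3L * 1024 * 1024 * 1024;    // 3 GB allocated to the app server

        long oneQuery = cachedBytes(70, maxDoc, 4);    // upper end of a single query
        long allFields = cachedBytes(440, maxDoc, 4);  // every facet-enabled field

        System.out.println("70 fields:  " + oneQuery + " bytes");
        System.out.println("440 fields: " + allFields + " bytes");
        System.out.println("share of heap for 440 fields: " + (allFields * 100 / heap) + "%");
    }
}
```

With these assumptions a single query's 50-70 fields stay well under the heap
size, while the full set of 440 fields approaches it, which is consistent with
the queries working at first and failing later.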

When you hit OOM, what does the Solr admin stats display say for FieldCache?

-- Jack Krupansky

-----Original Message----- 
From: Rahul R
Sent: Wednesday, May 02, 2012 2:22 AM
To: solr-user@lucene.apache.org
Subject: Re: Lucene FieldCache - Out of memory exception

Here is one sample query that I picked up from the log file :

q=*%3A*&fq=Category%3A%223__107%22&fq=S_P1540477699%3A%22MICROCIRCUIT%2C+LINE+TRANSCEIVERS%22&rows=0&facet=true&facet.mincount=1&facet.limit=2&facet.field=S_C1503120369&facet.field=S_P1406389942&facet.field=S_P1430116878&facet.field=S_P1430116881&facet.field=S_P1406453552&facet.field=S_P1406451296&facet.field=S_P1406452465&facet.field=S_C2968809156&facet.field=S_P1406389980&facet.field=S_P1540477699&facet.field=S_P1406389982&facet.field=S_P1406389984&facet.field=S_P1406451284&facet.field=S_P1406389926&facet.field=S_P1424886581&facet.field=S_P2017662632&facet.field=F_P1946367021&facet.field=S_P1430116884&facet.field=S_P2017662620&facet.field=F_P1406451304&facet.field=F_P1406451306&facet.field=F_P1406451308&facet.field=S_P1500901421&facet.field=S_P1507138990&facet.field=I_P1406452433&facet.field=I_P1406453565&facet.field=I_P1406452463&facet.field=I_P1406453573&facet.field=I_P1406451324&facet.field=I_P1406451288&facet.field=S_P1406451282&facet.field=S_P1406452471&facet.field=S_P1424886605&facet.field=S_P1946367015&facet.field=S_P1424886598&facet.field=S_P1946367018&facet.field=S_P1406453556&facet.field=S_P1406389932&facet.field=S_P2017662623&facet.field=S_P1406450978&facet.field=F_P1406452455&facet.field=S_P1406389972&facet.field=S_P1406389974&facet.field=S_P1406389986&facet.field=F_P1946367027&facet.field=F_P1406451294&facet.field=F_P1406451286&facet.field=F_P1406451328&facet.field=S_P1424886593&facet.field=S_P1406453567&facet.field=S_P2017662629&facet.field=S_P1406453571&facet.field=F_P1946367030&facet.field=S_P1406453569&facet.field=S_P2017662626&facet.field=S_P1406389978&facet.field=F_P1946367024

My primary question here is, can Solr handle this kind of query with so
many facet fields? I have tried using both enum and fc for facet.method and
there is no improvement with either.

Appreciate any help on this. Thank you.

- Rahul


On Mon, Apr 30, 2012 at 2:53 PM, Rahul R <ra...@gmail.com> wrote:

> Hello,
> I am using solr 1.3 with jdk 1.5.0_14 and weblogic 10MP1 application
> server on Solaris. I use embedded solr server. More details :
> Number of docs in solr index : 1.4 million
> Physical size of index : 640MB
> Total number of fields in the index : 700 (99% of these are dynamic fields)
> Total number of fields enabled for faceting : 440
> Avg number of facet fields participating in a faceted query : 50-70
> Total RAM allocated to weblogic appserver : 3GB (max possible)
>
> In a multi user environment with 3 users using this application for a
> period of around 40 minutes, the application runs out of memory. Analysis
> of the heap dump shows that almost 85% of the memory is retained by the
> FieldCache. Now I understand that the field cache is out of our control but
> would appreciate some suggestions on how to handle this issue.
>
> Some questions on this front :
> - some mail threads on this forum seem to indicate that there could be
> some connection between having dynamic fields and usage of FieldCache. Is
> this true ? Most of the fields in my index are dynamic fields.
> - as mentioned above, most of my faceted queries could have around 50-70
> facet fields (I would do SolrQuery.addFacetField() for around 50-70 fields
> per query). Could this be the source of the problem ? Is this too high for
> solr to support ?
> - Initially, I had a facet.sort defined in solrconfig.xml. Since
> FieldCache builds up on sorting, I even removed the facet.sort and tried,
> but no respite. The behavior is same as before.
> - The document id that I have for each document is quite big (around 50
> characters on average). Can this be a problem ? I reduced this to around 15
> characters and tried but still there is no improvement.
> - Can the size of the data be a problem ? But on this forum, I see many
> users talking of more than 100 million documents in their index. I have
> only 1.4 million with physical size of 640MB. The physical server on which
> this application is running, has sufficient RAM and CPU.
> - What gets stored in the FieldCache ? Is it the entire document or just
> the document Id ?
>
>
> Any help is much appreciated. Thank you.
>
> regards
> Rahul
>
>
> 


Re: Lucene FieldCache - Out of memory exception

Posted by Rahul R <ra...@gmail.com>.
Here is one sample query that I picked up from the log file :

q=*%3A*&fq=Category%3A%223__107%22&fq=S_P1540477699%3A%22MICROCIRCUIT%2C+LINE+TRANSCEIVERS%22&rows=0&facet=true&facet.mincount=1&facet.limit=2&facet.field=S_C1503120369&facet.field=S_P1406389942&facet.field=S_P1430116878&facet.field=S_P1430116881&facet.field=S_P1406453552&facet.field=S_P1406451296&facet.field=S_P1406452465&facet.field=S_C2968809156&facet.field=S_P1406389980&facet.field=S_P1540477699&facet.field=S_P1406389982&facet.field=S_P1406389984&facet.field=S_P1406451284&facet.field=S_P1406389926&facet.field=S_P1424886581&facet.field=S_P2017662632&facet.field=F_P1946367021&facet.field=S_P1430116884&facet.field=S_P2017662620&facet.field=F_P1406451304&facet.field=F_P1406451306&facet.field=F_P1406451308&facet.field=S_P1500901421&facet.field=S_P1507138990&facet.field=I_P1406452433&facet.field=I_P1406453565&facet.field=I_P1406452463&facet.field=I_P1406453573&facet.field=I_P1406451324&facet.field=I_P1406451288&facet.field=S_P1406451282&facet.field=S_P1406452471&facet.field=S_P1424886605&facet.field=S_P1946367015&facet.field=S_P1424886598&facet.field=S_P1946367018&facet.field=S_P1406453556&facet.field=S_P1406389932&facet.field=S_P2017662623&facet.field=S_P1406450978&facet.field=F_P1406452455&facet.field=S_P1406389972&facet.field=S_P1406389974&facet.field=S_P1406389986&facet.field=F_P1946367027&facet.field=F_P1406451294&facet.field=F_P1406451286&facet.field=F_P1406451328&facet.field=S_P1424886593&facet.field=S_P1406453567&facet.field=S_P2017662629&facet.field=S_P1406453571&facet.field=F_P1946367030&facet.field=S_P1406453569&facet.field=S_P2017662626&facet.field=S_P1406389978&facet.field=F_P1946367024

My primary question here is, can Solr handle this kind of query with so
many facet fields? I have tried using both enum and fc for facet.method and
there is no improvement with either.

Appreciate any help on this. Thank you.

- Rahul


On Mon, Apr 30, 2012 at 2:53 PM, Rahul R <ra...@gmail.com> wrote:

> Hello,
> I am using solr 1.3 with jdk 1.5.0_14 and weblogic 10MP1 application
> server on Solaris. I use embedded solr server. More details :
> Number of docs in solr index : 1.4 million
> Physical size of index : 640MB
> Total number of fields in the index : 700 (99% of these are dynamic fields)
> Total number of fields enabled for faceting : 440
> Avg number of facet fields participating in a faceted query : 50-70
> Total RAM allocated to weblogic appserver : 3GB (max possible)
>
> In a multi user environment with 3 users using this application for a
> period of around 40 minutes, the application runs out of memory. Analysis
> of the heap dump shows that almost 85% of the memory is retained by the
> FieldCache. Now I understand that the field cache is out of our control but
> would appreciate some suggestions on how to handle this issue.
>
> Some questions on this front :
> - some mail threads on this forum seem to indicate that there could be
> some connection between having dynamic fields and usage of FieldCache. Is
> this true ? Most of the fields in my index are dynamic fields.
> - as mentioned above, most of my faceted queries could have around 50-70
> facet fields (I would do SolrQuery.addFacetField() for around 50-70 fields
> per query). Could this be the source of the problem ? Is this too high for
> solr to support ?
> - Initially, I had a facet.sort defined in solrconfig.xml. Since
> FieldCache builds up on sorting, I even removed the facet.sort and tried,
> but no respite. The behavior is same as before.
> - The document id that I have for each document is quite big (around 50
> characters on average). Can this be a problem ? I reduced this to around 15
> characters and tried but still there is no improvement.
> - Can the size of the data be a problem ? But on this forum, I see many
> users talking of more than 100 million documents in their index. I have
> only 1.4 million with physical size of 640MB. The physical server on which
> this application is running, has sufficient RAM and CPU.
> - What gets stored in the FieldCache ? Is it the entire document or just
> the document Id ?
>
>
> Any help is much appreciated. Thank you.
>
> regards
> Rahul
>
>
>