Posted to solr-user@lucene.apache.org by Rahul R <ra...@gmail.com> on 2009/08/04 07:09:06 UTC

JVM Heap utilization & Memory leaks with Solr

I am trying to track memory utilization with my Application that uses Solr.
Details of the setup :
- 3rd party software: Solaris 10, WebLogic 10, JDK 1.5.0_14, Solr 1.3.0
- Hardware: 12 CPUs, 24 GB RAM

For testing during PSR I am using a smaller subset of the actual data that I
want to work with. Details of this smaller subset:
- 5 million records, 4.5 GB index size

Observations during PSR:
A) I have allocated 3.2 GB for the JVM(s) that I used. After all users
log out and a forced GC is done, only 60% of the heap is reclaimed. As part of
the logout process I am invalidating the HttpSession and calling close() on
the CoreContainer (a sketch of this cleanup follows after observation B). From
my application's side, I don't believe I am holding on to any resource. I
wanted to know if there are known issues surrounding memory leaks with Solr.
B) To further test this, I tried deploying with shards. 3.2 GB was allocated
to each JVM. All JVMs had 96% free heap space after start-up. I got varying
results with this.
Case 1: Used 6 WebLogic domains. My application was deployed on 1 domain.
I split the 5-million-document index into 5 parts of 1 million each and used
them as shards. After multiple users used the system and a forced GC was done,
around 94-96% of the heap was reclaimed in all the JVMs.
Case 2: Used 2 WebLogic domains. My application was deployed on 1 domain. On
the other, I deployed the entire 5-million-document index as one shard. After
multiple users used the system and a forced GC was done, around 76% of the heap
was reclaimed in the shard JVM, and 96% was reclaimed in the JVM where my
application was running. This result further convinces me that my
application can be absolved of holding on to memory resources.
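
For reference, the logout cleanup mentioned in A) looks roughly like this.
This is a minimal sketch rather than my exact code: the session attribute
name is made up, and CoreContainer.shutdown() stands in for the teardown
call referred to as close() above.

    import javax.servlet.http.HttpSessionEvent;
    import javax.servlet.http.HttpSessionListener;
    import org.apache.solr.core.CoreContainer;

    public class SolrSessionCleanup implements HttpSessionListener {
        public void sessionCreated(HttpSessionEvent se) { }

        // Fired when the HttpSession is invalidated at logout.
        public void sessionDestroyed(HttpSessionEvent se) {
            CoreContainer container =
                (CoreContainer) se.getSession().getAttribute("coreContainer");
            if (container != null) {
                container.shutdown(); // release cores held for this session
            }
        }
    }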

I am not sure how to interpret these results. For searching, I am using:
Without shards: EmbeddedSolrServer
With shards: CommonsHttpSolrServer
In terms of Solr objects, this is the only thing that differs in my code
between normal search and shards (distributed) search; a sketch of both
setups follows below.
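
Concretely, the two setups differ only in how the SolrServer is constructed
and in the shards parameter. A minimal sketch assuming the SolrJ 1.3 API;
the host names, port, and core name are hypothetical:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.core.CoreContainer;

    public class SearchSetups {
        public static void main(String[] args) throws Exception {
            // Without shards: an in-process server reading the index directly.
            CoreContainer container = new CoreContainer.Initializer().initialize();
            SolrServer embedded = new EmbeddedSolrServer(container, "core0");
            System.out.println(
                    embedded.query(new SolrQuery("*:*")).getResults().getNumFound());

            // With shards: an HTTP client; the shards parameter makes the
            // receiving node fan the query out to every listed shard.
            SolrServer http = new CommonsHttpSolrServer("http://host1:7001/solr");
            SolrQuery query = new SolrQuery("*:*");
            query.set("shards", "host1:7001/solr,host2:7001/solr");
            System.out.println(http.query(query).getResults().getNumFound());

            container.shutdown(); // release the embedded cores when done
        }
    }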

After looking at Case 1, I thought that the CommonsHttpSolrServer was more
memory efficient, but Case 2 proved me wrong. Or could there still be memory
leaks in my application? Any thoughts or suggestions would be welcome.

Regards
Rahul

Re: JVM Heap utilization & Memory leaks with Solr

Posted by Funtick <fu...@efendi.ca>.
Can you please tell me how many non-tokenized, single-valued fields your
schema uses, and how many documents you have?
Thanks,
Fuad




Re: JVM Heap utilization & Memory leaks with Solr

Posted by Rahul R <ra...@gmail.com>.
My primary issue is not an Out of Memory error at run time. It is memory leaks:
heap space not being released even after a forced GC. So after some time, as
progressively more heap gets utilized, I start running out of memory.
The verdict however seems unanimous that there are no known memory leak
issues within Solr. I am still looking at my application to analyse the
problem. Thank you.
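
As an aside, heap reclamation around a forced GC can be watched
programmatically; a minimal sketch using the standard java.lang.management
API:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;

    public class HeapCheck {
        public static void main(String[] args) {
            MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
            long before = memory.getHeapMemoryUsage().getUsed();
            memory.gc(); // requests a full GC (like System.gc()); not guaranteed
            long after = memory.getHeapMemoryUsage().getUsed();
            System.out.printf("heap used: %d MB -> %d MB (%d MB reclaimed)%n",
                    before >> 20, after >> 20, (before - after) >> 20);
        }
    }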


RE: JVM Heap utilization & Memory leaks with Solr

Posted by Fuad Efendi <fu...@efendi.ca>.
Most OutOfMemoryExceptions with SOLR (if not 100% of them) happen because of
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/FieldCache.html
- it is used internally in Lucene to cache field values per document ID.

My very long-term observation: SOLR can run without any problems for
days/months, and an unpredictable OOM happens just because someone tried a
sorted search, which populates an array with entries for ALL documents in the
index.

The only solution: calculate exactly the amount of RAM needed for the
FieldCache. For instance, for 100,000,000 documents a single FieldCache
instance may require 8*100,000,000 bytes (8 bytes per document entry?),
which is almost 1 GB
(at least!)
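
A back-of-envelope version of that estimate (the 8 bytes per document is the
assumption above; the real per-entry cost depends on the field type):

    public class FieldCacheEstimate {
        public static void main(String[] args) {
            long docs = 100000000L;   // documents in the index
            long bytesPerDoc = 8;     // assumed cost per document, per cached field
            double gb = docs * bytesPerDoc / (1024.0 * 1024 * 1024);
            System.out.printf("one FieldCache array: ~%.2f GB%n", gb); // ~0.75 GB
        }
    }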


I haven't noticed any memory leaks since I started to use 16 GB of RAM for the
SOLR instance (almost a year without any restart!)





Re: JVM Heap utilization & Memory leaks with Solr

Posted by Rahul R <ra...@gmail.com>.
*You should try to generate heap dumps and analyze the heap using a tool
like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
objects holding a large amount of memory*

The tool that I used also allows capturing heap snapshots. Eclipse had a
lot of prerequisites; you need to apply some three to five patches before
you can start using it. My observations with this tool were that some
HashMaps were taking up a lot of space, although I could not pin it down to
the exact HashMap. These would be either WebLogic's or Solr's. I will give
Eclipse's analyzer a try anyway and see how it goes. Thanks for your input.

Rahul


Re: JVM Heap utilization & Memory leaks with Solr

Posted by Gunnar Wagenknecht <gu...@wagenknecht.org>.
Rahul R schrieb:
> I tried using a profiling tool - Yourkit. The trial version was free for 15
> days. But I couldn't find anything of significance.

You should try to generate heap dumps and analyze the heap using a tool
like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
objects holding a large amount of memory.
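
One common way to get such dumps (assuming a HotSpot JVM recent enough to
support these flags; they exist in later 1.4.2/5.0 update releases) is to
have the JVM write one automatically on OOM:

    java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps ...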

-Gunnar

-- 
Gunnar Wagenknecht
gunnar@wagenknecht.org
http://wagenknecht.org/


Re: JVM Heap utilization & Memory leaks with Solr

Posted by Rahul R <ra...@gmail.com>.
All these 3700 fields are single-valued, non-boolean fields. Thanks.

Regards
Rahul


RE: JVM Heap utilization & Memory leaks with Solr

Posted by Fuad Efendi <fu...@efendi.ca>.
Hi Rahul,

JRockit could be used at least in a test environment to monitor the JVM (and
troubleshoot SOLR); it is licensed for free for developers! They even have an
Eclipse plugin now, and it is licensed by Oracle (BEA)... But, of course, in
large companies the test environment is in the hands of testers :)


But... 3700 fields will create (over time) 3700 arrays, each of size
5,000,000!!! Even if most fields are empty for most documents...
This applies only to non-tokenized, single-valued, non-boolean fields (Lucene
internals, the FieldCache)... and it won't be GC-collected after users log
off... Prefer a dedicated box for SOLR.
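
To put numbers on that (a rough sketch; 4 bytes per entry assumes one int ord
per document, and the real cost varies by field type):

    public class FieldCacheBudget {
        public static void main(String[] args) {
            long docs = 5000000L;     // documents per index
            long bytesPerEntry = 4;   // assumed int per document, per cached field
            double perFieldMb = docs * bytesPerEntry / (1024.0 * 1024);
            System.out.printf("~%.0f MB per cached field%n", perFieldMb); // ~19 MB
            // ~19 MB x 3700 fields is ~69 GB in the worst case -- so even a
            // fraction of those fields being sorted or faceted on will exhaust
            // a 3.2 GB heap, and FieldCache entries are not evicted.
        }
    }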

-Fuad



Re: JVM Heap utilization & Memory leaks with Solr

Posted by Rahul R <ra...@gmail.com>.
Fuad,
We have around 5 million documents and around 3700 fields. Not all documents
will have values for all the fields. JRockit is not approved for use within
my organization, but thanks for the info anyway.

Regards
Rahul


Re: JVM Heap utilization & Memory leaks with Solr

Posted by Funtick <fu...@efendi.ca>.
BTW, you should really prefer JRockit, which really rocks!!!

"Mission Control" has the necessary tooling; and JRockit produces a _nice_
exception stacktrace (explaining almost everything) even in the case of an
OOM, which the SUN JVM still fails to produce.


SolrServlet still catches "Throwable", so an OutOfMemoryError thrown during a
request is logged and turned into a 500 response rather than surfacing:

    } catch (Throwable e) {
      SolrException.log(log,e);
      sendErr(500, SolrException.toStr(e), request, response);
    } finally {







Re: JVM Heap utilization & Memory leaks with Solr

Posted by Rahul R <ra...@gmail.com>.
Otis,
Thank you for your response. I know there are a few variables here, but the
difference in memory utilization with and without shards somehow leads me to
believe that the leak could be within Solr.

I tried using a profiling tool, YourKit. The trial version was free for 15
days, but I couldn't find anything of significance.

Regards
Rahul



Re: JVM Heap utilization & Memory leaks with Solr

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Rahul,

A) There are no known (to me) memory leaks.
I think there are too many variables for a person to tell you what exactly is happening, plus you are dealing with the JVM here. :)

Try jmap -histo:live PID-HERE | less and see what's using your memory.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR


