Posted to solr-user@lucene.apache.org by Geeta Subramanian <gs...@commvault.com> on 2011/03/17 17:12:46 UTC

memory not getting released in tomcat after pushing large documents

Hi,

I am very new to Solr and facing a lot of issues when using Solr to push large documents.
I have Solr running in Tomcat with about 4 GB of heap allocated (-Xmx), but when I push about twenty-five 100 MB documents it runs out of heap space and fails.

Also, I tried pushing just one document. It went through successfully, but the Tomcat memory does not come down. It consumes about a gigabyte of memory for just one 100 MB document and does not release it.

Please let me know if I am making any mistake in the configuration or setup.

Here is the stack trace:
SEVERE: java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:2882)
	at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
	at java.lang.StringBuffer.append(StringBuffer.java:306)
	at java.io.StringWriter.write(StringWriter.java:77)
	at com.sun.org.apache.xml.internal.serializer.ToStream.processDirty(ToStream.java:1570)
	at com.sun.org.apache.xml.internal.serializer.ToStream.characters(ToStream.java:1488)
	at com.sun.org.apache.xml.internal.serializer.ToHTMLStream.characters(ToHTMLStream.java:1529)
	at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerHandlerImpl.characters(TransformerHandlerImpl.java:168)
	at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
	at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:153)
	at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
	at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
	at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:39)
	at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:61)
	at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:113)
	at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:151)
	at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:175)
	at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:144)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:142)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
	at com.commvault.solr.handler.extraction.CVExtractingDocumentLoader.load(CVExtractingDocumentLoader.java:349)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
	at filters.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:122)
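The top of the trace (Arrays.copyOf called from AbstractStringBuilder.expandCapacity) is the extracted text being accumulated into a single in-memory buffer whose backing char[] roughly doubles whenever it fills. A standalone sketch of that growth pattern (illustrative only, not Solr or Tika code):

```java
// Illustrative sketch of the growth pattern behind the
// AbstractStringBuilder.expandCapacity frames above: the backing char[]
// roughly doubles each time it fills, so the old and the new array are
// briefly live at the same time (that is the Arrays.copyOf frame).
public class BuilderGrowth {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(); // default capacity: 16 chars
        int lastCapacity = sb.capacity();
        for (int i = 0; i < 1000000; i++) {
            sb.append('x');
            if (sb.capacity() != lastCapacity) {
                System.out.println("length " + sb.length()
                        + " -> capacity grew to " + sb.capacity());
                lastCapacity = sb.capacity();
            }
        }
        System.out.println("final capacity: " + sb.capacity());
    }
}
```

Because Java strings are UTF-16, a 100 MB text file already needs about 200 MB once decoded to characters, and during each doubling the old and new arrays are live at once, so peak usage well above the file size is expected.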


Thanks for help,
Geeta













******************Legal Disclaimer***************************
"This communication may contain confidential and privileged material 
for the sole use of the intended recipient.  Any unauthorized review, 
use or distribution by others is strictly prohibited.  If you have 
received the message in error, please advise the sender by reply 
email and delete the message. Thank you."
****************************************************************

Re: memory not getting released in tomcat after pushing large documents

Posted by Darx Oman <da...@gmail.com>.
Hi guys,
I'm facing a similar problem, and I found out it is caused by MS SQL Server running on the same machine: just restarting the MS SQL service makes the memory go down.

Re: memory not getting released in tomcat after pushing large documents

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Thu, Mar 17, 2011 at 5:50 PM, Geeta Subramanian
<gs...@commvault.com> wrote:
> Here is the attached xml.
> In our xml, maxBufferedDocs is commented out. I hope that's not causing any issue.
> The ramBufferSizeMB is 32 MB; will changing this be of any use to me?

Nope... your index settings are fine.
Perhaps something in the extracting request handler or Tika is holding onto memory.
Has anyone else experienced/reproduced this?

Geeta, can you open a JIRA issue?  If you're actually giving the JVM
4G of heap (is this a 64 bit JVM?), this looks like a bug somewhere.

-Yonik
http://lucidimagination.com

RE: memory not getting released in tomcat after pushing large documents

Posted by Geeta Subramanian <gs...@commvault.com>.
Hi Yonik,

Here is the attached xml.
In our xml, maxBufferedDocs is commented out. I hope that's not causing any issue.
The ramBufferSizeMB is 32 MB; will changing this be of any use to me?

Thanks a lot,
Geeta



-----Original Message-----
From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
Sent: 17 March, 2011 5:27 PM
To: Geeta Subramanian
Cc: solr-user@lucene.apache.org
Subject: Re: memory not getting released in tomcat after pushing large documents

On Thu, Mar 17, 2011 at 3:55 PM, Geeta Subramanian <gs...@commvault.com> wrote:
> Hi Yonik,
>
> I am not setting the ramBufferSizeMB or maxBufferedDocs params...
> Do I need to set them for indexing?

No, the default settings that come with Solr should be fine.
You should verify that they have not been changed however.

An older solrconfig that used maxBufferedDocs could cause an OOM with large documents, since it buffered a certain number of documents instead of a certain amount of RAM.

Perhaps post your solrconfig (or at least the sections related to index configuration).

-Yonik
http://lucidimagination.com


> Regards,
> Geeta
>
> -----Original Message-----
> From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik 
> Seeley
> Sent: 17 March, 2011 3:45 PM
> To: Geeta Subramanian
> Cc: solr-user@lucene.apache.org
> Subject: Re: memory not getting released in tomcat after pushing large 
> documents
>
> In your solrconfig.xml,
> Are you specifying ramBufferSizeMB or maxBufferedDocs?
>
> -Yonik
> http://lucidimagination.com
>
>
> On Thu, Mar 17, 2011 at 12:27 PM, Geeta Subramanian <gs...@commvault.com> wrote:
>> Hi,
>>
>>  Thanks for the reply.
>> I am sorry, the logs I posted do have a custom update handler.
>>
>> But I have a local setup which does not have a custom update handler (it is just as downloaded from the Solr site), and even that gives me a heap space error.
>>
>> at java.util.Arrays.copyOf(Unknown Source)
>>        at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
>>        at java.lang.AbstractStringBuilder.append(Unknown Source)
>>        at java.lang.StringBuilder.append(Unknown Source)
>>        at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:257)
>>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>>        at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:153)
>>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>>        at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:39)
>>        at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:61)
>>        at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:113)
>>        at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:151)
>>        at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:175)
>>        at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:144)
>>        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:142)
>>        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
>>        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:112)
>>        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:193)
>>        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>>        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
>>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
>>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
>>
>>
>>
>> Also, in general, if I post 25 x 100 MB docs to Solr, what would be the ideal heap size?
>> Also, I see that when I push a single 100 MB document, Task Manager shows about 900 MB of memory in use, and subsequent pushes keep it around 900 MB, so at what point can an OOM crash happen?
>>
>> When I ran the YourKit profiler, I saw that around 1 GB of memory was consumed just by char[] and String[].
>> How can I find out who is creating these (is it Solr or Tika) and free up these objects?
>>
>>
>> Thank you so much for your time and help,
>>
>>
>>
>> Regards,
>> Geeta
>>
>>
>>
>> -----Original Message-----
>> From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik 
>> Seeley
>> Sent: 17 March, 2011 12:21 PM
>> To: solr-user@lucene.apache.org
>> Cc: Geeta Subramanian
>> Subject: Re: memory not getting released in tomcat after pushing 
>> large documents
>>
>> On Thu, Mar 17, 2011 at 12:12 PM, Geeta Subramanian <gs...@commvault.com> wrote:
>>>        at com.commvault.solr.handler.extraction.CVExtractingDocumentLoader.load(CVExtractingDocumentLoader.java:349)
>>
>> Looks like you're using a custom update handler.  Perhaps that's accidentally hanging onto memory?
>>
>> -Yonik
>> http://lucidimagination.com
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
>
>
>
>
>
>
>
>
>













Re: memory not getting released in tomcat after pushing large documents

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Thu, Mar 17, 2011 at 3:55 PM, Geeta Subramanian
<gs...@commvault.com> wrote:
> Hi Yonik,
>
> I am not setting the ramBufferSizeMB or maxBufferedDocs params...
> Do I need to set them for indexing?

No, the default settings that come with Solr should be fine.
You should verify that they have not been changed however.

An older solrconfig that used maxBufferedDocs could cause an OOM with
large documents, since it buffered a certain number of documents
instead of a certain amount of RAM.

Perhaps post your solrconfig (or at least the sections related to
index configuration).
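For reference, these settings live in the index section of solrconfig.xml; the fragment below sketches a stock Solr config of that era with the usual defaults (not the poster's actual file):

```xml
<!-- Sketch of the index settings referred to above; values are the
     common defaults, not taken from the poster's actual config. -->
<indexDefaults>
  <!-- Flush the in-memory index buffer once it reaches this many MB: -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <!-- Older configs flushed by document count instead, which is risky
       with very large documents; usually left commented out: -->
  <!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
  <mergeFactor>10</mergeFactor>
</indexDefaults>
```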

-Yonik
http://lucidimagination.com


> Regards,
> Geeta
>
> -----Original Message-----
> From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
> Sent: 17 March, 2011 3:45 PM
> To: Geeta Subramanian
> Cc: solr-user@lucene.apache.org
> Subject: Re: memory not getting released in tomcat after pushing large documents
>
> In your solrconfig.xml,
> Are you specifying ramBufferSizeMB or maxBufferedDocs?
>
> -Yonik
> http://lucidimagination.com
>
>
> On Thu, Mar 17, 2011 at 12:27 PM, Geeta Subramanian <gs...@commvault.com> wrote:
>> Hi,
>>
>>  Thanks for the reply.
>> I am sorry, the logs I posted do have a custom update handler.
>>
>> But I have a local setup which does not have a custom update handler (it is just as downloaded from the Solr site), and even that gives me a heap space error.
>>
>> at java.util.Arrays.copyOf(Unknown Source)
>>        at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
>>        at java.lang.AbstractStringBuilder.append(Unknown Source)
>>        at java.lang.StringBuilder.append(Unknown Source)
>>        at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:257)
>>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>>        at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:153)
>>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>>        at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:39)
>>        at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:61)
>>        at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:113)
>>        at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:151)
>>        at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:175)
>>        at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:144)
>>        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:142)
>>        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
>>        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:112)
>>        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:193)
>>        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>>        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
>>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
>>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
>>
>>
>>
>> Also, in general, if I post 25 x 100 MB docs to Solr, what would be the ideal heap size?
>> Also, I see that when I push a single 100 MB document, Task Manager shows about 900 MB of memory in use, and subsequent pushes keep it around 900 MB, so at what point can an OOM crash happen?
>>
>> When I ran the YourKit profiler, I saw that around 1 GB of memory was consumed just by char[] and String[].
>> How can I find out who is creating these (is it Solr or Tika) and free up these objects?
>>
>>
>> Thank you so much for your time and help,
>>
>>
>>
>> Regards,
>> Geeta
>>
>>
>>
>> -----Original Message-----
>> From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik
>> Seeley
>> Sent: 17 March, 2011 12:21 PM
>> To: solr-user@lucene.apache.org
>> Cc: Geeta Subramanian
>> Subject: Re: memory not getting released in tomcat after pushing large
>> documents
>>
>> On Thu, Mar 17, 2011 at 12:12 PM, Geeta Subramanian <gs...@commvault.com> wrote:
>>>        at com.commvault.solr.handler.extraction.CVExtractingDocumentLoader.load(CVExtractingDocumentLoader.java:349)
>>
>> Looks like you're using a custom update handler.  Perhaps that's accidentally hanging onto memory?
>>
>> -Yonik
>> http://lucidimagination.com
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
>
>
>
>
>
>
>
>
>

RE: memory not getting released in tomcat after pushing large documents

Posted by Geeta Subramanian <gs...@commvault.com>.
Hi Yonik,

I am not setting the ramBufferSizeMB or maxBufferedDocs params...
Do I need to set them for indexing?

Regards,
Geeta

-----Original Message-----
From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
Sent: 17 March, 2011 3:45 PM
To: Geeta Subramanian
Cc: solr-user@lucene.apache.org
Subject: Re: memory not getting released in tomcat after pushing large documents

In your solrconfig.xml,
Are you specifying ramBufferSizeMB or maxBufferedDocs?

-Yonik
http://lucidimagination.com


On Thu, Mar 17, 2011 at 12:27 PM, Geeta Subramanian <gs...@commvault.com> wrote:
> Hi,
>
>  Thanks for the reply.
> I am sorry, the logs I posted do have a custom update handler.
>
> But I have a local setup which does not have a custom update handler (it is just as downloaded from the Solr site), and even that gives me a heap space error.
>
> at java.util.Arrays.copyOf(Unknown Source)
>        at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
>        at java.lang.AbstractStringBuilder.append(Unknown Source)
>        at java.lang.StringBuilder.append(Unknown Source)
>        at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:257)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>        at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:153)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>        at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:39)
>        at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:61)
>        at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:113)
>        at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:151)
>        at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:175)
>        at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:144)
>        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:142)
>        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
>        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:112)
>        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:193)
>        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
>
>
>
> Also, in general, if I post 25 x 100 MB docs to Solr, what would be the ideal heap size?
> Also, I see that when I push a single 100 MB document, Task Manager shows about 900 MB of memory in use, and subsequent pushes keep it around 900 MB, so at what point can an OOM crash happen?
>
> When I ran the YourKit profiler, I saw that around 1 GB of memory was consumed just by char[] and String[].
> How can I find out who is creating these (is it Solr or Tika) and free up these objects?
>
>
> Thank you so much for your time and help,
>
>
>
> Regards,
> Geeta
>
>
>
> -----Original Message-----
> From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik 
> Seeley
> Sent: 17 March, 2011 12:21 PM
> To: solr-user@lucene.apache.org
> Cc: Geeta Subramanian
> Subject: Re: memory not getting released in tomcat after pushing large 
> documents
>
> On Thu, Mar 17, 2011 at 12:12 PM, Geeta Subramanian <gs...@commvault.com> wrote:
>>        at com.commvault.solr.handler.extraction.CVExtractingDocumentLoader.load(CVExtractingDocumentLoader.java:349)
>
> Looks like you're using a custom update handler.  Perhaps that's accidentally hanging onto memory?
>
> -Yonik
> http://lucidimagination.com
>
>
>
>
>
>
>
>
>
>
>
>
>
>












Re: memory not getting released in tomcat after pushing large documents

Posted by Yonik Seeley <yo...@lucidimagination.com>.
In your solrconfig.xml,
Are you specifying ramBufferSizeMB or maxBufferedDocs?

-Yonik
http://lucidimagination.com


On Thu, Mar 17, 2011 at 12:27 PM, Geeta Subramanian
<gs...@commvault.com> wrote:
> Hi,
>
>  Thanks for the reply.
> I am sorry, the logs I posted do have a custom update handler.
>
> But I have a local setup which does not have a custom update handler (it is just as downloaded from the Solr site), and even that gives me a heap space error.
>
> at java.util.Arrays.copyOf(Unknown Source)
>        at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
>        at java.lang.AbstractStringBuilder.append(Unknown Source)
>        at java.lang.StringBuilder.append(Unknown Source)
>        at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:257)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>        at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:153)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
>        at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:39)
>        at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:61)
>        at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:113)
>        at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:151)
>        at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:175)
>        at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:144)
>        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:142)
>        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
>        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:112)
>        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:193)
>        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
>
>
>
> Also, in general, if I post 25 x 100 MB docs to Solr, what would be the ideal heap size?
> Also, I see that when I push a single 100 MB document, Task Manager shows about 900 MB of memory in use, and subsequent pushes keep it around 900 MB, so at what point can an OOM crash happen?
>
> When I ran the YourKit profiler, I saw that around 1 GB of memory was consumed just by char[] and String[].
> How can I find out who is creating these (is it Solr or Tika) and free up these objects?
>
>
> Thank you so much for your time and help,
>
>
>
> Regards,
> Geeta
>
>
>
> -----Original Message-----
> From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
> Sent: 17 March, 2011 12:21 PM
> To: solr-user@lucene.apache.org
> Cc: Geeta Subramanian
> Subject: Re: memory not getting released in tomcat after pushing large documents
>
> On Thu, Mar 17, 2011 at 12:12 PM, Geeta Subramanian <gs...@commvault.com> wrote:
>>        at com.commvault.solr.handler.extraction.CVExtractingDocumentLoader.load(CVExtractingDocumentLoader.java:349)
>
> Looks like you're using a custom update handler.  Perhaps that's accidentally hanging onto memory?
>
> -Yonik
> http://lucidimagination.com
>
>
>
>
>
>
>
>
>
>
>
>
>
>

RE: memory not getting released in tomcat after pushing large documents

Posted by Geeta Subramanian <gs...@commvault.com>.
Hi,

 Thanks for the reply.
I am sorry, the logs I posted do have a custom update handler.

But I have a local setup which does not have a custom update handler (it is just as downloaded from the Solr site), and even that gives me a heap space error.

at java.util.Arrays.copyOf(Unknown Source)
	at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
	at java.lang.AbstractStringBuilder.append(Unknown Source)
	at java.lang.StringBuilder.append(Unknown Source)
	at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:257)
	at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
	at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:153)
	at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
	at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
	at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:39)
	at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:61)
	at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:113)
	at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:151)
	at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:175)
	at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:144)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:142)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:112)
	at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:193)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)



Also, in general, if I post 25 x 100 MB docs to Solr, what would be the ideal heap size?
Also, I see that when I push a single 100 MB document, Task Manager shows about 900 MB of memory in use, and subsequent pushes keep it around 900 MB, so at what point can an OOM crash happen?

When I ran the YourKit profiler, I saw that around 1 GB of memory was consumed just by char[] and String[].
How can I find out who is creating these (is it Solr or Tika) and free up these objects?
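A rough back-of-envelope (assumed multipliers, not measured Solr internals) suggests why a single 100 MB text document can plausibly account for most of a ~900 MB footprint:

```java
// Back-of-envelope sketch (assumed factors, not measured Solr internals):
// why one 100 MB text document can need several hundred MB of heap while
// it is being extracted into a single in-memory string.
public class HeapEstimate {
    public static void main(String[] args) {
        long mb = 1024 * 1024;
        long docBytes = 100 * mb;      // 100 MB of (mostly ASCII) text
        long asChars  = 2 * docBytes;  // ~200 MB once decoded to UTF-16 (2 bytes/char)
        // While the builder's char[] doubles, the old and new arrays coexist:
        long growthPeak = asChars + 2 * asChars;  // ~600 MB transient
        System.out.println("decoded chars: " + asChars / mb + " MB");
        System.out.println("peak during buffer growth: " + growthPeak / mb + " MB");
    }
}
```

The numbers are order-of-magnitude only, but they fit most of the heap showing up as char[]. Without a commercial profiler, `jmap -histo:live <pid>` prints a per-class heap histogram, and on HotSpot it forces a full GC first, which also shows whether the memory is truly leaked or just not yet collected.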


Thank you so much for your time and help,



Regards,
Geeta



-----Original Message-----
From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
Sent: 17 March, 2011 12:21 PM
To: solr-user@lucene.apache.org
Cc: Geeta Subramanian
Subject: Re: memory not getting released in tomcat after pushing large documents

On Thu, Mar 17, 2011 at 12:12 PM, Geeta Subramanian <gs...@commvault.com> wrote:
>        at com.commvault.solr.handler.extraction.CVExtractingDocumentLoader.load(CVExtractingDocumentLoader.java:349)

Looks like you're using a custom update handler.  Perhaps that's accidentally hanging onto memory?

-Yonik
http://lucidimagination.com














Re: memory not getting released in tomcat after pushing large documents

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Thu, Mar 17, 2011 at 12:12 PM, Geeta Subramanian
<gs...@commvault.com> wrote:
>        at com.commvault.solr.handler.extraction.CVExtractingDocumentLoader.load(CVExtractingDocumentLoader.java:349)

Looks like you're using a custom update handler.  Perhaps that's
accidentally hanging onto memory?

-Yonik
http://lucidimagination.com

Re: memory not getting released in tomcat after pushing large documents

Posted by Markus Jelsma <ma...@openindex.io>.
Hi,

25 * 100 MB = 2.5 GB will most likely fail with just 4 GB of heap space. But
consecutive single `pushes`, as you call them, of 100 MB documents should work
fine. Heap memory will only drop after the garbage collector comes along.

Cheers,

On Thursday 17 March 2011 17:12:46 Geeta Subramanian wrote:
> Hi,
> 
> I am very new to SOLR and facing a lot of issues when using SOLR to push
> large documents. I have solr running in tomcat. I have allocated about 4gb
> memory (-Xmx) but I am pushing about twenty five 100 mb documents and
> gives heap space and fails.
> 
> Also I tried pushing just 1 document. It went thru successfully, but the
> tomcat memory does not come down. It consumes about a gig memory for just
> one 100 mb document and does not release it.
> 
> Please let me know if I am making any mistake in configuration/ or set up.
> 
> Here is the stack trace:
> SEVERE: java.lang.OutOfMemoryError: Java heap space
> 	at java.util.Arrays.copyOf(Arrays.java:2882)
> 	at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
> 	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
> 	at java.lang.StringBuffer.append(StringBuffer.java:306)
> 	at java.io.StringWriter.write(StringWriter.java:77)
> 	at com.sun.org.apache.xml.internal.serializer.ToStream.processDirty(ToStream.java:1570)
> 	at com.sun.org.apache.xml.internal.serializer.ToStream.characters(ToStream.java:1488)
> 	at com.sun.org.apache.xml.internal.serializer.ToHTMLStream.characters(ToHTMLStream.java:1529)
> 	at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerHandlerImpl.characters(TransformerHandlerImpl.java:168)
> 	at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
> 	at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:153)
> 	at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
> 	at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:124)
> 	at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:39)
> 	at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:61)
> 	at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:113)
> 	at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:151)
> 	at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:175)
> 	at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:144)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:142)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
> 	at com.commvault.solr.handler.extraction.CVExtractingDocumentLoader.load(CVExtractingDocumentLoader.java:349)
> 	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
> 	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> 	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
> 	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
> 	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
> 	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> 	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> 	at filters.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:122)
> 
> 
> Thanks for help,
> Geeta
> 

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350