Posted to slide-user@jakarta.apache.org by Aslam Bari <ia...@yahoo.co.in> on 2007/01/20 13:05:03 UTC

Big XML file indexing out of Memory error

Dear all,

I am sending big XML files, about 30-50 MB in size, to the server. I have
configured SimpleXmlExtractor. When this extractor starts parsing, one of two
things happens:

1. Either an out-of-memory error occurs,
2. or the request times out.

I looked at the code of SimpleXMLExtractor; it uses SAXBuilder with JDOM. Can
we use something else to speed up the process?

Thanks...
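
For reference, one alternative to building a full JDOM tree is to stream the
document through a plain SAX handler and forward the character data as it
arrives, so the whole document never has to sit in memory at once. The sketch
below uses only the JAXP classes that ship with the JDK. The class and method
names (StreamingTextExtractor, extractText) are hypothetical, and how such a
handler would be wired into Slide's extractor interface is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Writer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Slide.
class StreamingTextExtractor {

    /** Copies all character data of the XML document to out without building a tree. */
    static void extractText(InputStream xml, final Writer out)
            throws IOException, SAXException, ParserConfigurationException {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) throws SAXException {
                try {
                    // Forward each chunk of text as soon as the parser reports it.
                    out.write(ch, start, length);
                } catch (IOException e) {
                    throw new SAXException(e);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        out.flush();
    }
}
```

Memory use then depends on the parser's buffers and on what the Writer does
with the text, not on the size of the document tree.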
-- 
View this message in context: http://www.nabble.com/Big-XML-file-indexing-out-of-Memory-error-tf3044992.html#a8464267
Sent from the Jakarta Slide - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: slide-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: slide-user-help@jakarta.apache.org


Re: Big XML file indexing out of Memory error

Posted by Marco Ferretti <ma...@gmail.com>.
On 1/22/07, Miguel Figueiredo <mf...@maisis.pt> wrote:
>
>  Hello Aslam,
>
>  Not sure if you have already tried this, but try setting these options for the JVM:
>
>  -Xms128m -Xmx768m -XX:MaxPermSize=256m
>
>  Hope this helps,
>  Miguel Figueiredo
>

Hi,
I can confirm this.
SAXBuilder uses JAXP by default, and memory problems of this kind are
documented in the JAXB release notes
(http://java.sun.com/webservices/docs/1.6/jaxb/ReleaseNotes.html),

in particular:

Compilations of sufficiently large and/or complex schemas may fail
with java.lang.StackOverflowError. Try working around this problem by
increasing the native stack size. For example, to increase the stack
size to 2 megabytes:

    java -Xss2m

If you run out of memory, try increasing the maximum memory size. For
example, to increase the maximum memory to 128 megabytes:

    java -Xmx128m



Re: A question about Slide/lucene

Posted by Aslam Bari <ia...@yahoo.co.in>.
I don't know exactly, but there is a long-winded alternative.

Write a program that runs a loop: get all the files one by one from Slide and
simply put them back again, and Slide will reindex them. Something like this
pseudocode:

while (moreFilesExist)
{
           file = getFileFromServer();   // GET request
           putFileToServer(file);        // PUT request
}
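
The body of that loop could look roughly like the Java sketch below. It is
only an illustration under several assumptions: the resource is reachable over
plain HTTP at a known URL, no authentication is needed, and a PUT of unchanged
content is enough to trigger re-indexing (that last point is the suggestion
above, not something verified here). Enumerating the resources, for example
with a WebDAV PROPFIND, is left out. The names Reindexer and reput are
hypothetical. The content is spooled through a temporary file so that large
files are not held in memory; this needs Java 7 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Slide.
class Reindexer {

    /** GETs one resource into a temp file, PUTs it back unchanged, returns the PUT status. */
    static int reput(URL resource) throws IOException {
        Path tmp = Files.createTempFile("reindex", ".bin");
        try {
            // GET: stream the body to disk instead of into a byte array.
            HttpURLConnection get = (HttpURLConnection) resource.openConnection();
            get.setRequestMethod("GET");
            try (InputStream in = get.getInputStream()) {
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            }

            // PUT: send the same bytes back; fixed-length mode avoids buffering the body.
            HttpURLConnection put = (HttpURLConnection) resource.openConnection();
            put.setRequestMethod("PUT");
            put.setDoOutput(true);
            put.setFixedLengthStreamingMode(Files.size(tmp));
            try (OutputStream out = put.getOutputStream()) {
                Files.copy(tmp, out);
            }
            return put.getResponseCode();
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A caller would invoke reput once per resource URL and treat any status outside
the 2xx range as a failure to retry or log.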




Bruno Dorel wrote:
> 
> Is it possible to re-index a Slide repository?
> Due to several problems (normal in my use case), I need to re-index my
> Slide content in a "rescue" procedure that synchronizes the content (Slide)
> with the content index (Lucene). Is that possible? Can this re-indexing be
> run periodically?
> 
> Regards
> 
> B DOREL

-- 
View this message in context: http://www.nabble.com/Big-XML-file-indexing-out-of-Memory-error-tf3044992.html#a8554982
Sent from the Jakarta Slide - User mailing list archive at Nabble.com.




A question about Slide/lucene

Posted by Bruno Dorel <bd...@wanadoo.fr>.

Is it possible to re-index a Slide repository?
Due to several problems (normal in my use case), I need to re-index my
Slide content in a "rescue" procedure that synchronizes the content (Slide)
with the content index (Lucene). Is that possible? Can this re-indexing be
run periodically?

Regards

B DOREL







Re: Big XML file indexing out of Memory error

Posted by Marco Ferretti <ma...@gmail.com>.
On 1/23/07, Aslam Bari <ia...@yahoo.co.in> wrote:
>
> Hi Marco,
> I did the same. I have some doubts to clear up. Please guide me on the
> topics below, because so far I don't know much about the Slide code.
>
> 1. I am running Tomcat from the command line with startup.bat and parameters like
>
>             startup.bat -Xms256m -Xmx512m -XX:MaxPermSize=256m -Xss2m
>
> Is this the right way to use these parameters, or do we have to configure
> them somewhere else?

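I *think* (I usually work under Linux; on Windows I use the service) that
startup.bat does not pass these arguments on to the JVM. Set them in the
JAVA_OPTS environment variable before running it, so your commands look like
set JAVA_OPTS=-Xms256m -Xmx512m -XX:MaxPermSize=256m -Xss2m
startup.bat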
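
One way to check whether options such as -Xmx actually reached the JVM is to
ask the running JVM for its startup arguments. The sketch below uses the
standard java.lang.management API (Java 5 or later); the class name JvmArgs is
hypothetical, and inside Tomcat the methods would have to be called from code
running in the server, for example a small test servlet or JSP.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of Slide or Tomcat.
class JvmArgs {

    /** Options the JVM was actually started with, e.g. -Xmx512m. */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** True if some startup option begins with the given prefix, e.g. "-Xmx". */
    static boolean hasOption(String prefix) {
        for (String arg : inputArguments()) {
            if (arg.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Prints one startup option per line.
        for (String arg : inputArguments()) {
            System.out.println(arg);
        }
    }
}
```

If hasOption("-Xmx") returns false inside the server, the heap setting was not
picked up and the JVM is running with its default maximum.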

>
>
> 2. I am using ByteArrayOutputStream and ByteArrayInputStream to extract the
> contents of the XML file. I think the out-of-memory error generally comes from
> here. Are these byte-array streams not sufficient, or should I use something
> else for extracting?
>
I think the classes you're using are fine. Further discussion of XML
parsing (we don't want to reinvent the wheel here) can be found
at Sun: http://java.sun.com/developer/technicalArticles/xml/JavaTechandXML_part3/
and at IBM: http://www-128.ibm.com/developerworks/library/wi-parsexml/
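
If the byte arrays do turn out to be the bottleneck, note that a
ByteArrayOutputStream keeps the whole content on the heap and toByteArray()
allocates a second copy. A possible alternative, sketched below, is to spool
the incoming stream to a temporary file and read it back from there. This
assumes Java 7 or later (java.nio.file) and that the surrounding code can work
with an InputStream; whether Slide's extractor code allows that is not checked
here. The names Spool and toTempFile are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Slide.
class Spool {

    /** Copies the stream to a temporary file and returns the file's path. */
    static Path toTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("slide-content", ".xml");
        // Files.copy streams the data to disk, so heap use stays small
        // regardless of the content size.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }
}
```

The caller can then reopen the content with Files.newInputStream(path) as often
as needed and should delete the file with Files.deleteIfExists(path) when done.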
>
> 3. I have changed "sequential-mode" from "fine-grain" to "full". What is the
> difference between them, and what effect will the change have on how Slide
> works?
>

Again, this is explained here
http://wiki.apache.org/jakarta-slide/ConcurrencyIssues

>
> 4. I have configured custom properties in Domain.xml like this. Is this the
> right way, or have I used the wrong syntax?
>
> <propertiesindexer classname="org.apache.slide.index.lucene.LucenePropertiesIndexer">
>     <parameter name="indexpath">${filespath}store/index_meta</parameter>
>     <parameter name="asynchron">false</parameter>
>     <parameter name="priority">3</parameter>
>
>     <configuration name="indexed-properties">
>         <!-- Custom Property -->
>         <property name="book" namespace="DAV:">
>             <string/><is-defined/>
>         </property>
>         <property name="size" namespace="DAV:">
>             <integer/><is-defined/>
>         </property>
>         <property name="datepub" namespace="DAV:">
>             <date/><is-defined/>
>         </property>
>         <property name="datemod" namespace="DAV:">
>             <date/><is-defined/>
>         </property>
>     </configuration>
> </propertiesindexer>
>

Looks OK. Again, the wiki can help you:
http://wiki.apache.org/jakarta-slide/DaslConfiguration

>
>
> Thanks...
>

NP !



Re: Big XML file indexing out of Memory error

Posted by Aslam Bari <ia...@yahoo.co.in>.
Hi Marco,
I did the same. I have some doubts to clear up. Please guide me on the topics
below, because so far I don't know much about the Slide code.

1. I am running Tomcat from the command line with startup.bat and parameters like

            startup.bat -Xms256m -Xmx512m -XX:MaxPermSize=256m -Xss2m

Is this the right way to use these parameters, or do we have to configure them
somewhere else?


2. I am using ByteArrayOutputStream and ByteArrayInputStream to extract the
contents of the XML file. I think the out-of-memory error generally comes from
here. Are these byte-array streams not sufficient, or should I use something
else for extracting?


3. I have changed "sequential-mode" from "fine-grain" to "full". What is the
difference between them, and what effect will the change have on how Slide
works?


4. I have configured custom properties in Domain.xml like this. Is this the
right way, or have I used the wrong syntax?

<propertiesindexer classname="org.apache.slide.index.lucene.LucenePropertiesIndexer">
    <parameter name="indexpath">${filespath}store/index_meta</parameter>
    <parameter name="asynchron">false</parameter>
    <parameter name="priority">3</parameter>

    <configuration name="indexed-properties">
        <!-- Custom Property -->
        <property name="book" namespace="DAV:">
            <string/><is-defined/>
        </property>
        <property name="size" namespace="DAV:">
            <integer/><is-defined/>
        </property>
        <property name="datepub" namespace="DAV:">
            <date/><is-defined/>
        </property>
        <property name="datemod" namespace="DAV:">
            <date/><is-defined/>
        </property>
    </configuration>
</propertiesindexer>



Thanks...


Marco Ferretti wrote:
> 
>> I have also heard a rumour that Slide has many memory leaks. How can I
>> run a GC to clear garbage memory from time to time?
> 
> System.gc();
> 
> although I am not sure this will do the trick for you, since with the memory
> problem Miguel is referring to, the JVM *should* automatically run a GC when
> no more space is available, before throwing the exception
> 

-- 
View this message in context: http://www.nabble.com/Big-XML-file-indexing-out-of-Memory-error-tf3044992.html#a8520941
Sent from the Jakarta Slide - User mailing list archive at Nabble.com.




Re: Big XML file indexing out of Memory error

Posted by Marco Ferretti <ma...@gmail.com>.
>
> I have also heard a rumour that Slide has many memory leaks. How can I run a
> GC to clear garbage memory from time to time?

System.gc();

although I am not sure this will do the trick for you, since with the memory
problem Miguel is referring to, the JVM *should* automatically run a GC when no
more space is available, before throwing the exception
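
To illustrate the point: System.gc() is only a request to the JVM, and an
OutOfMemoryError is thrown only when the garbage collector could not make
enough memory available, so an explicit call is unlikely to prevent the error.
The small sketch below (the class name GcProbe is hypothetical) prints heap
usage before and after such a request, which can help to see whether memory is
really being retained.

```java
// Hypothetical diagnostic helper, not part of Slide.
class GcProbe {

    /** Heap currently in use, in bytes. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedBytes();
        System.gc(); // a request only; the JVM is free to ignore it
        long after = usedBytes();
        System.out.println("used before gc: " + before / 1024 + " KB");
        System.out.println("used after gc : " + after / 1024 + " KB");
        System.out.println("max heap      : "
                + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");
    }
}
```

If the "used after gc" figure keeps growing across repeated requests, the
objects are still referenced somewhere and no amount of explicit GC calls will
free them.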



RE: Big XML file indexing out of Memory error

Posted by Aslam Bari <ia...@yahoo.co.in>.
Hi Miguel,

I have set the options like this; is it right? I will test again and tell
you what happened.

-Xms512m -Xmx512m -XX:MaxPermSize=256m -Xss2m

I have also heard a rumour that Slide has many memory leaks. How can I run a
GC to clear garbage memory from time to time?


Miguel Figueiredo wrote:
> 
> 
>  Hello Aslam,
> 
>  Not sure if you have already tried this, but try setting these options for the JVM:
> 
>  -Xms128m -Xmx768m -XX:MaxPermSize=256m
> 
>  Hope this helps,
>  Miguel Figueiredo
> 

-- 
View this message in context: http://www.nabble.com/Big-XML-file-indexing-out-of-Memory-error-tf3044992.html#a8515240
Sent from the Jakarta Slide - User mailing list archive at Nabble.com.




RE: Big XML file indexing out of Memory error

Posted by Miguel Figueiredo <mf...@maisis.pt>.
 Hello Aslam,

 Not sure if you have already tried this, but try setting these options for the JVM:

 -Xms128m -Xmx768m -XX:MaxPermSize=256m

 Hope this helps,
 Miguel Figueiredo



Re: Big XML file indexing out of Memory error

Posted by Bruno Dorel <bd...@wanadoo.fr>.
Yes, of course you can use a database to store URIs, links, locks, etc.
I started tests this morning using PostgreSQL (a free RDBMS).

I hope I can send results in a couple of days.

Regards

B DOREL


Aslam Bari wrote:

>Dear all,
>
>I am sending big XML files, about 30-50 MB in size, to the server. I have
>configured SimpleXmlExtractor. When this extractor starts parsing, one of two
>things happens:
>
>1. Either an out-of-memory error occurs,
>2. or the request times out.
>
>I looked at the code of SimpleXMLExtractor; it uses SAXBuilder with JDOM. Can
>we use something else to speed up the process?
>
>Thanks...






Re: Big XML file indexing out of Memory error

Posted by Aslam Bari <ia...@yahoo.co.in>.
Hi Dorel,
Is there any way to get that URI patch soon? I need it urgently.
Thanks...


Bruno Dorel wrote:
> 
> How lucky you are! I have just 128 MB. That is why we developed the patch to
> use URIs instead of byte arrays.
> I'm waiting for an answer from the slide-dev list; the patch was posted in
> December 2006 and we are still waiting for an answer from the dev list.
> 
> Regards
> 
> B DOREL
> 
> Aslam Bari wrote:
> 
>>Hi Miguel,
>>Thanks for the reply. I know I can increase the heap size, but the problem is
>>that I can use a maximum heap size of only 512m; more is not allowed.
>>
>>set JAVA_OPTS= -Xmx512m
>>But what if a 35-50 MB XML file comes into the extractor? Will it run out of
>>memory?

-- 
View this message in context: http://www.nabble.com/Big-XML-file-indexing-out-of-Memory-error-tf3044992.html#a8515161
Sent from the Jakarta Slide - User mailing list archive at Nabble.com.

