Posted to user@nutch.apache.org by "K.A.Hussain Ali" <Hu...@photoninfotech.com> on 2005/12/26 09:27:46 UTC

"Out of memory error" while updating

Hi all,

I am using Nutch to crawl a few sites. After crawling to a certain depth, I update the webdb,

and while updating the webdb I get an "Out of Memory" error.

I increased the JVM heap size using java_opts and even reduced the token size per page in nutch-default.xml, but I still get the error.

I am using Tomcat and have only one application running on it.

What are the system requirements for Nutch to avoid this error?

I also tried the suggestions from the mailing list, but none of them helped.

Any help is greatly appreciated.
Thanks in advance

regards
-Hussain.

Re: "Out of memory error" while updating

Posted by Stefan Groschupf <sg...@media-style.com>.
> Should I change the value of 'io.sort.mb' and/or 'io.sort.factor'?
> And if so, what should I change them to in order to eliminate the error?
Yes, since it looks like it crashes during sorting.
>
> Also, is there a minimum RAM requirement for Nutch to do
> indexing and searching?

Well, not really, but you should have 1 GB of RAM if you want to do
serious things.
You can set the maximum heap size. From the bin/nutch script:
#   NUTCH_HEAPSIZE  The maximum amount of heap to use, in MB.
#                   Default is 1000.

...
JAVA_HEAP_MAX=-Xmx1000m
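
As a rough illustration of what the excerpt above implies, the sketch below shows how a launcher script could turn NUTCH_HEAPSIZE into the -Xmx flag. The heap_flag helper is hypothetical and is not taken from bin/nutch; only the variable name, the MB unit, and the default of 1000 come from the excerpt.

```shell
#!/bin/sh
# Sketch only: heap_flag is a hypothetical helper, not part of bin/nutch.
# It builds the JVM max-heap flag from a size given in MB.
heap_flag() {
  # $1: heap size in MB; an empty value falls back to the default of 1000
  size="${1:-1000}"
  echo "-Xmx${size}m"
}

# Use NUTCH_HEAPSIZE if it is set, otherwise the default.
JAVA_HEAP_MAX=$(heap_flag "${NUTCH_HEAPSIZE:-}")
echo "JAVA_HEAP_MAX=$JAVA_HEAP_MAX"
```

Under these assumptions, exporting NUTCH_HEAPSIZE=2000 before starting the script would request a 2000 MB heap, which only helps if the machine actually has that much memory available.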

HTH
Stefan


Re: "Out of memory error" while updating

Posted by "K.A.Hussain Ali" <Hu...@photoninfotech.com>.
Hi Stefan and all,

This is the output I get when I catch the exception:

051225 161555 status: segment 20051225122501, 1371 pages, 85 errors, 50758960 bytes, 13849896 ms
051225 161555 status: 0.09898991 pages/s, 28.6323 kb/s, 37023.312 bytes/page
051225 161556 Updating C:/test_dir/crawl_dir/db
051225 161556 Updating for C:/test_dir/crawl_dir/segments/20051225122501
051225 161556 Processing document 0
051225 161656 Processing document 1000
051225 161723 Finishing update
java.lang.OutOfMemoryError

I did not even run a search; I get this exception while updating the webdb.
I only reduced the number of threads and increased the number of retries in
nutch-site.xml.

I didn't change any other options.

Should I change the value of 'io.sort.mb' and/or 'io.sort.factor'?
And if so, what should I change them to in order to eliminate the error?

Also, is there a minimum RAM requirement for Nutch to do indexing and
searching?

Any help is greatly appreciated
Thanks in advance

regards
-Hussain.





Re: "Out of memory error" while updating

Posted by Stefan Groschupf <sg...@media-style.com>.
Do you have a stack trace?
Is it maybe related to a 'too many open files' exception?
You can also try lowering 'io.sort.mb' and/or 'io.sort.factor'.
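
To help rule out the open-files theory, here is a minimal sketch for checking the per-process limit on open file descriptors. It assumes a Unix-like shell (for example Cygwin on Windows); open_files_limit is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Sketch only: open_files_limit is a hypothetical helper.
# It prints the soft limit on open file descriptors for this shell
# and for the processes started from it.
open_files_limit() {
  ulimit -n
}

echo "open files limit: $(open_files_limit)"
```

If the reported number is small compared with the number of files being merged at once, that would be consistent with a 'too many open files' failure rather than a pure heap problem.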

Stefan


---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net