Posted to user@nutch.apache.org by "K.A.Hussain Ali" <Hu...@photoninfotech.com> on 2005/12/20 16:58:53 UTC
"Out of memory exception"-while updating
Hi all,
I am using Nutch to crawl some sites, but I get an "Out Of Memory Error"
when I try updating the webdb with a good number of URLs.
I tried to find a solution on the mailing list but found nothing.
Could anyone offer suggestions on this?
How much RAM does Nutch require for proper updating and indexing with a lakh of URLs?
Any help would be greatly appreciated.
Regards,
-Hussain.
Re: "Out of memory exception"-while updating
Posted by "Håvard W. Kongsgård" <h....@niap.no>.
<property>
<name>indexer.max.tokens</name>
<value>10000</value>
<description>
The maximum number of tokens that will be indexed for a single field
in a document. This limits the amount of memory required for
indexing, so that collections with very large files will not crash
the indexing process by running out of memory.
Note that this effectively truncates large documents, excluding
from the index tokens that occur further in the document. If you
know your source documents are large, be sure to set this value
high enough to accommodate the expected size. If you set it to
Integer.MAX_VALUE, then the only limit is your memory, but you
should anticipate an OutOfMemoryError.
</description>
</property>
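As a minimal sketch, assuming the usual Nutch convention that local overrides go in conf/nutch-site.xml rather than in the default file, lowering the token limit might look like the fragment below. The value 5000 is purely illustrative, not a tested recommendation, and the enclosing root element of the site file is left out because it depends on the Nutch version.

```xml
<!-- Hypothetical override for conf/nutch-site.xml; the value is illustrative only. -->
<property>
  <name>indexer.max.tokens</name>
  <!-- A lower value indexes fewer tokens per field, trading completeness for memory. -->
  <value>5000</value>
</property>
```

A smaller value truncates large documents earlier, so tokens past the limit will not be searchable.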
http://wiki.media-style.com/display/nutchDocu/Hardware