Posted to user@nutch.apache.org by "K.A.Hussain Ali" <Hu...@photoninfotech.com> on 2005/12/20 16:58:53 UTC

"Out of memory exception"-while updating

Hi all,

    I am using Nutch to crawl a site, but I get an "Out Of Memory Error"
    when I try to update the webdb with a large number of URLs.

I searched the mailing list for a solution but found nothing.

Could anyone offer suggestions on this?

How much RAM does Nutch require to update and index properly with a lakh (100,000) of URLs?

Any help would be greatly appreciated.

Regards,
-Hussain.

Re: "Out of memory exception"-while updating

Posted by "Håvard W. Kongsgård" <h....@niap.no>.
<property>
  <name>indexer.max.tokens</name>
  <value>10000</value>
  <description>
  The maximum number of tokens that will be indexed for a single field
  in a document. This limits the amount of memory required for
  indexing, so that collections with very large files will not crash
  the indexing process by running out of memory.

  Note that this effectively truncates large documents, excluding
  from the index tokens that occur further in the document. If you
  know your source documents are large, be sure to set this value
  high enough to accommodate the expected size. If you set it to
  Integer.MAX_VALUE, then the only limit is your memory, but you
  should anticipate an OutOfMemoryError.
  </description>
</property>
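
A minimal sketch of how this might be applied, assuming the usual Nutch convention that properties placed in conf/nutch-site.xml override those in conf/nutch-default.xml. The value 5000 is purely illustrative, not a recommendation: a lower limit trades index completeness for memory. The description above only covers memory used during indexing, so this setting may not help if the error occurs in the webdb update step itself.

```xml
<!-- Hypothetical override: place inside the root element of conf/nutch-site.xml -->
<property>
  <name>indexer.max.tokens</name>
  <!-- Illustrative value (an assumption); tokens beyond this count
       in a single field are left out of the index -->
  <value>5000</value>
</property>
```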

http://wiki.media-style.com/display/nutchDocu/Hardware
