Posted to user@nutch.apache.org by Markus Jelsma <ma...@openindex.io> on 2012/05/03 08:32:16 UTC
Re: Generator OOM
FYI, I checked the code and it is indeed the host or domain limit that is
responsible for the OOM. The score is fine, as it is not being accumulated
anyway. You can work around the problem by increasing the heap space
allocated to the reducers, significantly increasing the number of
reducers, or slightly increasing the host or domain limit value.
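
As a rough sketch of where two of those knobs live, assuming Nutch 1.x
running on Hadoop 1.x: the host or domain limit is controlled by
generate.max.count together with generate.count.mode, and the heap of the
task JVMs by mapred.child.java.opts. The values below are illustrative
placeholders only, not tuned recommendations. The number of reducers is
left out, since how it is set depends on how the generator job is launched.

```xml
<!-- nutch-site.xml (sketch; all values are placeholders) -->
<configuration>
  <!-- Count URLs per host or per domain when applying the limit -->
  <property>
    <name>generate.count.mode</name>
    <value>domain</value>
  </property>
  <!-- Maximum number of URLs per host/domain in a single fetch list;
       -1 means no limit -->
  <property>
    <name>generate.max.count</name>
    <value>1000</value>
  </property>
  <!-- Heap for map and reduce task JVMs (Hadoop 1.x property name) -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2000m</value>
  </property>
</configuration>
```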
On Thu, 26 Apr 2012 21:02:58 +0200, Markus Jelsma
<ma...@openindex.io> wrote:
> Hi,
>
> We sometimes see the generator running OOM. This happens because we
> either have too high a topN value or too many segments to generate.
> In any case, a very large number of records are generated with the
> same (lowest) score and end up in a single reducer. We limit the
> generator by domain, which may be a source of trouble.
>
> I've not yet found a way around this problem, so I'm looking for
> suggestions.
>
> Thanks,
> Markus