Posted to user@nutch.apache.org by Nicholas W <44...@log1.net> on 2013/05/23 10:47:50 UTC

OutOfMemoryError for bin/nutch elasticindex ocpnutch -all

Dear List,
I have been following the instructions at
http://wiki.apache.org/nutch/Nutch2Tutorial to get a Nutch installation
running with Elasticsearch. The crawl completed with no real issues, but
when I try to load the results into Elasticsearch I run into trouble.

I issue the command:

bin/nutch elasticindex ocpnutch -all

It waits for a long time and then comes back with an error:

Exception in thread "main" java.lang.RuntimeException: job failed: name=elastic-index [ocpnutch], jobid=job_local_0001

If I look in the logs at:

~/apache-nutch-2.1/runtime/local/logs/hadoop.log

I see several errors like this:

Exception caught on netty layer [[id: 0x569764bd, /192.168.17.39:52554 => /192.168.17.60:9300]]
java.lang.OutOfMemoryError: Java heap space

There is nothing in the logs on the Elasticsearch side.

I have tried setting elastic.max.bulk.docs and elastic.max.bulk.size to
small values and allocating several GB of memory to Nutch, but to no avail.
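
For reference, a minimal sketch of how these two properties can be overridden
in conf/nutch-site.xml, assuming the usual Hadoop-style property format. The
values shown are illustrative placeholders only, not the exact values from
this setup:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Illustrative value: maximum number of documents per bulk request -->
  <property>
    <name>elastic.max.bulk.docs</name>
    <value>50</value>
  </property>
  <!-- Illustrative value: maximum size of a bulk request, in bytes -->
  <property>
    <name>elastic.max.bulk.size</name>
    <value>500000</value>
  </property>
</configuration>
```

In a local runtime the file would normally sit under runtime/local/conf, so
that bin/nutch picks it up on the next run.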

The JVM is:
Java(TM) SE Runtime Environment (build 1.7.0_21-b11)

Does anyone have any idea what I am doing wrong? What other diagnostic
information would be helpful to solve this problem?

Thanks a lot,
Regards,
Nicholas W.