Posted to user@nutch.apache.org by Emmanuel <jo...@gmail.com> on 2007/10/03 16:33:14 UTC

Mergesegs error

I am trying to merge 2 segments into 1. I have a cluster of 4 machines running Hadoop 0.13.1 and the latest Nutch trunk.
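
For reference, this is roughly the command I am running (the output directory and segment paths below are just placeholders for my real ones):

    bin/nutch mergesegs crawl/merged_segments crawl/segments/20071001000000 crawl/segments/20071002000000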
Every time I start the merge, I get the following error:
2007-10-03 22:06:28,272 INFO  mapred.TaskInProgress - Error from task_0001_m_000011_0: java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2786)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
        at org.apache.nutch.protocol.Content.write(Content.java:164)
        at org.apache.hadoop.io.GenericWritable.write(GenericWritable.java:100)
        at org.apache.nutch.metadata.MetaWrapper.write(MetaWrapper.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:365)
        at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:331)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:186)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1707)

I have increased the memory, but it does not seem to make any difference.
The error appears immediately at the beginning of the process.
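
In case it matters, here is roughly how I raised the heap for the map/reduce child tasks; I am assuming the standard mapred.child.java.opts property in hadoop-site.xml, and the 1024m value is just what I tried:

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1024m</value>
    </property>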

Have you experienced the same issue?
Does anybody use mergesegs?

Thanks in advance for your help.