Posted to user@nutch.apache.org by Marseld Dedgjonaj <ma...@ikubinfo.com> on 2011/01/04 13:27:41 UTC
Exception on segment merging
Hello everybody,
I have configured nutch-1.2 to crawl all URLs of a specific website.
It ran fine for a while, but now that the number of indexed URLs has grown
beyond 30,000, I get an exception on segment merging.
Has anybody seen this kind of error?
The exception is shown below.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:638)
    at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:683)
Merge Segments- End at: 04-01-2011 07:40:48
Thanks in advance & Best Regards,
Marseldi