Posted to user@nutch.apache.org by Marseld Dedgjonaj <ma...@ikubinfo.com> on 2011/01/04 13:27:41 UTC

Exception on segment merging

Hello everybody,

I have configured nutch-1.2 to crawl all URLs of a specific website.

It ran fine for a while, but now that the number of indexed URLs has grown
beyond 30,000, I get an exception on segment merging.

Has anybody seen this kind of error?

 

The exception is shown below.

 

Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
        at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:638)
        at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:683)
Merge Segments-  End at:   04-01-2011 07:40:48

 

Thanks in advance & Best Regards,

Marseldi


