Posted to dev@nutch.apache.org by "stack@archive.org (JIRA)" <ji...@apache.org> on 2006/07/29 06:27:14 UTC
[jira] Updated: (NUTCH-333) SegmentMerger and SegmentReader should use NutchJob
[ http://issues.apache.org/jira/browse/NUTCH-333?page=all ]
stack@archive.org updated NUTCH-333:
------------------------------------
Attachment: nutch333.patch
Patch attached.
One last thing: as is, everything works fine in 'standalone mode'. The problem happens in distributed mode.
> SegmentMerger and SegmentReader should use NutchJob
> ---------------------------------------------------
>
> Key: NUTCH-333
> URL: http://issues.apache.org/jira/browse/NUTCH-333
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 0.9-dev
> Reporter: stack@archive.org
> Priority: Minor
> Attachments: nutch333.patch
>
>
> I have a job jar that is Nutch with additions. I can usually launch this job jar on a pure Hadoop platform without issue, and I can run Nutch jobs (update db, invert links, etc.) without issue. Recently I tried to do the same with SegmentMerger, only it failed, complaining about a ClassNotFoundException:
> 2006-07-28 20:43:54,371 WARN org.apache.hadoop.mapred.JobTracker: job init failed
> java.io.IOException: java.lang.ClassNotFoundException: org.apache.nutch.segment.SegmentMerger$ObjectInputFormat
> at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:130)
> at org.apache.hadoop.mapred.JobTracker$JobInitThread.run(JobTracker.java:310)
> at java.lang.Thread.run(Thread.java:595)
> java.io.IOException: Job failed!
> After digging and chatting today with Stefan, I found that the SegmentMerger and SegmentReader classes are not like the others. The others create their JobConf during job setup by doing a 'new NutchJob', whereas Segment* does 'new JobConf'. Sure enough, if I make the change, all works.
> NutchJob triggers the setting of the job jar into the configuration (JobConf.findContainingJar is run). This doesn't happen for 'new JobConf'.
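
To illustrate why the job jar matters here, below is a minimal, self-contained sketch of what a "find the containing jar" lookup could look like. It is not the Hadoop or Nutch source. The class name JarLookupSketch, the helpers jarPathFromUrl and classResourceName, and the /tmp/nutch.job path in the usage note are all hypothetical names made up for this example. The idea, as described in the report, is that the job setup asks which jar a known class was loaded from and records that jar in the configuration, so that remote nodes can later resolve classes such as SegmentMerger$ObjectInputFormat.

```java
import java.net.URL;

// Illustrative sketch only: hypothetical helper, not Hadoop's or Nutch's code.
final class JarLookupSketch {

    // Extract the jar path from a class-resource URL string such as
    // "jar:file:/tmp/nutch.job!/org/apache/nutch/Foo.class".
    // Returns null when the URL does not point inside a jar.
    // Note: this sketch does not URL-decode the path (e.g. "%20").
    static String jarPathFromUrl(String resourceUrl) {
        if (resourceUrl == null || !resourceUrl.startsWith("jar:")) {
            return null;
        }
        String path = resourceUrl.substring("jar:".length());
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        int bang = path.indexOf('!');
        return bang >= 0 ? path.substring(0, bang) : path;
    }

    // Turn a class into the resource name a class loader would look up.
    // Nested classes keep their '$', e.g. "java/util/Map$Entry.class".
    static String classResourceName(Class<?> cls) {
        return cls.getName().replace('.', '/') + ".class";
    }

    // Locate the jar that contains cls, or return null if the class
    // was not loaded from a jar (for example, from a classes directory).
    static String findContainingJar(Class<?> cls) {
        String name = classResourceName(cls);
        ClassLoader loader = cls.getClassLoader();
        URL url = (loader != null)
                ? loader.getResource(name)
                : ClassLoader.getSystemResource(name);
        return url == null ? null : jarPathFromUrl(url.toString());
    }
}
```

For example, given the made-up URL "jar:file:/tmp/nutch.job!/org/apache/nutch/segment/SegmentMerger.class", jarPathFromUrl yields "/tmp/nutch.job". In standalone mode every class is already on the local classpath, so a missing job jar setting goes unnoticed, which is consistent with the failure showing up only in distributed mode.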
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira