Posted to common-user@hadoop.apache.org by Stefan Groschupf <sg...@media-style.com> on 2006/03/03 19:07:22 UTC

classloading issue

Hi,

I have a problem with classloading in Hadoop.
I have written a set of mappers and reducers together with a set of
custom Writable implementations.
My classes also require the Nutch writables.

I am able to get things running locally, but not distributed.
Since I have two jars that need to be on the classpath (my jar and
nutch.jar), it is not clear to me how to get them distributed to the
tasktrackers.

Is it possible that the tasktrackers already need the custom
writables on their classpath, since the classes must be loadable
before they can be read from the job configuration?
I also tried to replace my Nutch dependencies with ObjectWritable,
but that throws an error as well, since SequenceFile.getValueClass is
used instead of ObjectWritable.
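
Roughly the kind of thing I mean (a simplified, untested sketch, not
my real code; the key and value classes here are just placeholders):

    import org.apache.hadoop.io.ObjectWritable;
    import org.apache.hadoop.io.UTF8;
    import org.apache.hadoop.mapred.JobConf;

    public class ObjectWritableSketch {
      public static JobConf configure() {
        JobConf job = new JobConf();
        job.setOutputKeyClass(UTF8.class);
        // Declare ObjectWritable so the job configuration does not
        // name the custom/Nutch classes directly.
        job.setOutputValueClass(ObjectWritable.class);
        // In the mapper each real value gets wrapped:
        //   output.collect(key, new ObjectWritable(myCustomValue));
        return job;
      }
    }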

So, in short: how do I get a job processed that depends on two jars,
without restarting all the tasktrackers?

Thanks for any hints.
Stefan 

Re: classloading issue

Posted by Stefan Groschupf <sg...@media-style.com>.
> Does that help?

Yes very much, thanks a lot!

Stefan 

Re: classloading issue

Posted by Doug Cutting <cu...@apache.org>.
Stefan Groschupf wrote:
> So, in short: how do I get a job processed that depends on two jars,
> without restarting all the tasktrackers?

You can put multiple jars in the lib directory of your job jar file. 
(Look in Nutch's build.xml for an example.)  Note also that NutchJob 
specifies that the job jar is the jar that contains the NutchJob class. 
You can override that either by calling JobConf.setJar() or by 
constructing your job with 'new JobConf(My.class)' instead of 'new 
NutchJob()' or 'new JobConf()' (a quick sketch is below).  Does that help?
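
For example, something like this should do it (an untested sketch;
MyJob, the paths, and the jar layout shown are just placeholders):

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class MyJob {
      public static void main(String[] args) throws Exception {
        // Passing MyJob.class makes the job jar the jar that contains
        // MyJob, rather than the jar containing NutchJob.
        JobConf job = new JobConf(MyJob.class);

        // Alternatively, name the jar explicitly:
        //   job.setJar("/path/to/my-job.jar");

        // The job jar itself can bundle nutch.jar (and anything else
        // you need) under a top-level lib/ directory, e.g.:
        //
        //   my-job.jar
        //     my/package/MyMapper.class ...
        //     lib/nutch.jar
        //
        // so those jars travel to the tasktrackers with the job.

        // ... set mapper, reducer, input and output paths, etc. ...
        JobClient.runJob(job);
      }
    }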

Doug