Posted to common-user@hadoop.apache.org by Dennis Kubes <ku...@apache.org> on 2007/07/19 02:13:42 UTC

Multiple Job Jar files?

Is it possible to have multiple job jar files being submitted to hadoop 
at once?  If not, is this a feature that might be useful?

I can see this being useful for custom Nutch development, having a nutch 
job.jar and a custom.job.jar file.

Dennis Kubes

Re: Multiple Job Jar files?

Posted by Doug Cutting <cu...@apache.org>.
Dennis Kubes wrote:
> Ok, I have completed the testing and it is working well.  One problem 
> though.  I noticed that we are using a distributed cache for the job 
> files.  If I am creating new job jar files on the fly, but still copying 
> them to the job.jar location, how is this affected by distributed caching?

The cache will still be effective.  Typically, in the course of a job, 
multiple map tasks and multiple reduce tasks run on each host.  The 
cache retrieves just a single copy of the job's jar for all of these tasks.

However, with a new jar per job, the cache will not be effective across 
jobs.  But this is not nearly as critical as caching across tasks in a 
job, since there are typically thousands of tasks per job.  One could 
attempt to optimize across jobs, but I think that would be overkill, 
especially for the first version of this feature.

Doug
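
To make the cache behavior described above concrete, here is a toy model in Java. It is only a sketch and does not use Hadoop's actual caching code: the HostCache class, the jar identifiers, and the /local/cache/ path are all hypothetical, and it assumes that a jar built fresh for each job shows up under a new cache key.

```java
import java.util.HashMap;
import java.util.Map;

public class JarCacheSketch {

    /** Toy per-host cache: keeps one local copy per distinct jar identifier. */
    static class HostCache {
        private final Map<String, String> localCopies = new HashMap<>();
        private int fetches = 0;

        /** Returns the local path for a jar, fetching it only on the first request. */
        String localize(String jarId) {
            return localCopies.computeIfAbsent(jarId, id -> {
                fetches++;                    // stands in for the one real copy over the network
                return "/local/cache/" + id;  // hypothetical local path
            });
        }

        int fetchCount() {
            return fetches;
        }
    }

    public static void main(String[] args) {
        HostCache host = new HostCache();

        // Many tasks of the same job on one host share a single fetched copy.
        for (int task = 0; task < 1000; task++) {
            host.localize("job_0001/job.jar");
        }
        System.out.println(host.fetchCount()); // prints 1

        // A second job with a freshly built jar is a new key, so it is fetched again.
        for (int task = 0; task < 1000; task++) {
            host.localize("job_0002/job.jar");
        }
        System.out.println(host.fetchCount()); // prints 2
    }
}
```

In this model the per-job cost is one fetch per host no matter how many tasks run, which is why the loss of caching across jobs matters much less than caching across tasks.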

Re: Multiple Job Jar files?

Posted by Dennis Kubes <ku...@apache.org>.
Ok, I have completed the testing and it is working well.  One problem 
though.  I noticed that we are using a distributed cache for the job 
files.  If I am creating new job jar files on the fly, but still copying 
them to the job.jar location, how is this affected by distributed caching?

Dennis Kubes

Dennis Kubes wrote:
> Ok, I read the JIRA and have been hacking away at this for the past 
> couple of hours.  I have a workable patch that I just need to test.
> It follows the JIRA's proposal to create a master job.jar file from the
> multiple job jar files passed in.  I will test and post tomorrow morning.
> 
> Dennis Kubes
> 
> Runping Qi wrote:
>> Yes, definitely. There is a JIRA opened precisely for that:
>> https://issues.apache.org/jira/browse/HADOOP-1622
>>
>> Runping Qi
>>
>>
>>> -----Original Message-----
>>> From: Dennis Kubes [mailto:kubes@apache.org]
>>> Sent: Wednesday, July 18, 2007 5:14 PM
>>> To: hadoop-user@lucene.apache.org
>>> Subject: Multiple Job Jar files?
>>>
>>> Is it possible to have multiple job jar files being submitted to hadoop
>>> at once?  If not, is this a feature that might be useful?
>>>
>>> I can see this being useful for custom Nutch development, having a nutch
>>> job.jar and a custom.job.jar file.
>>>
>>> Dennis Kubes
>>

Re: Multiple Job Jar files?

Posted by Dennis Kubes <ku...@apache.org>.
Ok, I read the JIRA and have been hacking away at this for the past 
couple of hours.  I have a workable patch that I just need to test.
It follows the JIRA's proposal to create a master job.jar file from the
multiple job jar files passed in.  I will test and post tomorrow morning.

Dennis Kubes
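
For readers who want to see what "a master job.jar file from multiple job jar files" could look like, here is a minimal sketch using only the standard java.util.jar classes. It is not the patch discussed in this thread or the code attached to HADOOP-1622. The JobJarMerger class name and the first-jar-wins rule for duplicate entries are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

public class JobJarMerger {

    /**
     * Copies the entries of each input jar into a single output jar.
     * If the same entry name appears in more than one jar, the first
     * occurrence is kept and later ones are skipped (an assumed policy).
     * JarInputStream consumes a leading META-INF/MANIFEST.MF itself,
     * so manifests are not copied by this sketch.
     */
    public static void merge(List<Path> inputJars, Path masterJar) throws IOException {
        Set<String> seen = new HashSet<>();
        byte[] buffer = new byte[8192];

        try (OutputStream out = Files.newOutputStream(masterJar);
             JarOutputStream jarOut = new JarOutputStream(out)) {
            for (Path jar : inputJars) {
                try (InputStream in = Files.newInputStream(jar);
                     JarInputStream jarIn = new JarInputStream(in)) {
                    JarEntry entry;
                    while ((entry = jarIn.getNextJarEntry()) != null) {
                        if (!seen.add(entry.getName())) {
                            continue; // duplicate name: keep the earlier copy
                        }
                        // Use a fresh entry so sizes and CRC are recomputed on write.
                        jarOut.putNextEntry(new JarEntry(entry.getName()));
                        int n;
                        while ((n = jarIn.read(buffer)) != -1) {
                            jarOut.write(buffer, 0, n);
                        }
                        jarOut.closeEntry();
                    }
                }
            }
        }
    }
}
```

A caller would pass the jars in priority order, for example the custom job jar first and the Nutch job jar second, so that custom classes override stock ones under the assumed first-wins rule.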

Runping Qi wrote:
> Yes, definitely. There is a JIRA opened precisely for that:
> https://issues.apache.org/jira/browse/HADOOP-1622
> 
> Runping Qi
> 
> 
>> -----Original Message-----
>> From: Dennis Kubes [mailto:kubes@apache.org]
>> Sent: Wednesday, July 18, 2007 5:14 PM
>> To: hadoop-user@lucene.apache.org
>> Subject: Multiple Job Jar files?
>>
>> Is it possible to have multiple job jar files being submitted to hadoop
>> at once?  If not, is this a feature that might be useful?
>>
>> I can see this being useful for custom Nutch development, having a nutch
>> job.jar and a custom.job.jar file.
>>
>> Dennis Kubes
> 

RE: Multiple Job Jar files?

Posted by Runping Qi <ru...@yahoo-inc.com>.
Yes, definitely. There is a JIRA opened precisely for that:
https://issues.apache.org/jira/browse/HADOOP-1622

Runping Qi


> -----Original Message-----
> From: Dennis Kubes [mailto:kubes@apache.org]
> Sent: Wednesday, July 18, 2007 5:14 PM
> To: hadoop-user@lucene.apache.org
> Subject: Multiple Job Jar files?
> 
> Is it possible to have multiple job jar files being submitted to hadoop
> at once?  If not, is this a feature that might be useful?
> 
> I can see this being useful for custom Nutch development, having a nutch
> job.jar and a custom.job.jar file.
> 
> Dennis Kubes