Posted to common-user@hadoop.apache.org by Dennis Kubes <ku...@apache.org> on 2007/07/19 02:13:42 UTC
Multiple Job Jar files?
Is it possible to have multiple job jar files submitted to Hadoop
at once? If not, is this a feature that might be useful?
I can see this being useful for custom Nutch development, having a nutch
job.jar and a custom.job.jar file.
Dennis Kubes
Re: Multiple Job Jar files?
Posted by Doug Cutting <cu...@apache.org>.
Dennis Kubes wrote:
> Ok, I have completed the testing and it is working well. One problem
> though. I noticed that we are using a distributed cache for the job
> files. If I am creating new job jar files on the fly, but still copying
> them to the job.jar location, how is this affected by distributed caching?
The cache will still be effective. Typically, in the course of a job,
multiple map tasks and multiple reduce tasks run on each host. The
cache retrieves just a single copy of the job's jar for all of these tasks.
However, with a new jar per job, the cache will not be effective across
jobs. But this is not nearly as critical as caching across tasks in a
job, since there are typically thousands of tasks per job. One could
attempt to optimize across jobs, but I think that would be overkill,
especially for the first version of this feature.
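
To make the across-tasks versus across-jobs distinction concrete, here is a toy model of one host's cache. It is only a sketch and not Hadoop's distributed cache code: the class JobJarCacheSketch, the localize method, and the idea of keying the cache by a job id string are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class JobJarCacheSketch {
    // Hypothetical per-host cache: one local jar copy per job id.
    private final Map<String, byte[]> localCopies = new HashMap<>();
    private int fetches = 0;

    /** Returns the local copy of a job's jar, fetching it only on first use. */
    public byte[] localize(String jobId) {
        return localCopies.computeIfAbsent(jobId, id -> {
            fetches++;          // stands in for copying the jar to this host
            return new byte[0]; // placeholder for the jar contents
        });
    }

    public int fetches() {
        return fetches;
    }

    public static void main(String[] args) {
        JobJarCacheSketch host = new JobJarCacheSketch();
        // Many tasks of the same job share a single fetch.
        for (int task = 0; task < 1000; task++) {
            host.localize("job_0001");
        }
        // A second job brings its own jar, so it costs one more fetch.
        for (int task = 0; task < 1000; task++) {
            host.localize("job_0002");
        }
        System.out.println(host.fetches()); // prints 2
    }
}
```

In this model, two thousand task launches cost two fetches, which is why losing reuse across jobs matters far less than reuse across the tasks of one job.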
Doug
Re: Multiple Job Jar files?
Posted by Dennis Kubes <ku...@apache.org>.
Ok, I have completed the testing and it is working well. One problem
though. I noticed that we are using a distributed cache for the job
files. If I am creating new job jar files on the fly, but still copying
them to the job.jar location, how is this affected by distributed caching?
Dennis Kubes
Re: Multiple Job Jar files?
Posted by Dennis Kubes <ku...@apache.org>.
Ok, I read the JIRA and have been hacking away at this for the past
couple of hours. I have a workable patch that I just need to test.
It follows what the JIRA proposed: creating a master job.jar file from
the multiple job jar files passed in. I will test and post tomorrow morning.
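
The general idea of building one master jar from several jars can be sketched as follows. This is not the patch itself, which is not shown in this thread. The class name JobJarMerger, the merge method, and the rule that the first jar wins when two jars contain the same entry name are assumptions made for illustration. It uses only java.util.jar from the JDK and needs Java 9 or later for InputStream.transferTo.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

public class JobJarMerger {

    /**
     * Copies the entries of each input jar into one output jar.
     * If an entry name occurs more than once, the copy from the
     * earliest jar in the list is kept (an assumed policy).
     * JarInputStream consumes a leading manifest itself, so input
     * manifests are not carried over in this sketch.
     * Returns the number of entries written.
     */
    public static int merge(List<Path> inputJars, Path masterJar) throws IOException {
        Set<String> seen = new HashSet<>();
        try (OutputStream out = Files.newOutputStream(masterJar);
             JarOutputStream jarOut = new JarOutputStream(out)) {
            for (Path jar : inputJars) {
                try (InputStream in = Files.newInputStream(jar);
                     JarInputStream jarIn = new JarInputStream(in)) {
                    JarEntry entry;
                    while ((entry = jarIn.getNextJarEntry()) != null) {
                        if (!seen.add(entry.getName())) {
                            continue; // duplicate name: keep the earlier copy
                        }
                        jarOut.putNextEntry(new JarEntry(entry.getName()));
                        jarIn.transferTo(jarOut); // copies the current entry's bytes
                        jarOut.closeEntry();
                    }
                }
            }
        }
        return seen.size();
    }
}
```

A caller would pass the jars in priority order, for example a custom jar before a base jar, so that the custom classes take precedence under the first-wins rule.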
Dennis Kubes
Runping Qi wrote:
> Yes, definitely. There is a JIRA opened precisely for that:
> https://issues.apache.org/jira/browse/HADOOP-1622
RE: Multiple Job Jar files?
Posted by Runping Qi <ru...@yahoo-inc.com>.
Yes, definitely. There is a JIRA opened precisely for that:
https://issues.apache.org/jira/browse/HADOOP-1622
Runping Qi