Posted to mapreduce-dev@hadoop.apache.org by Arun C Murthy <ac...@yahoo-inc.com> on 2010/03/11 18:47:39 UTC
Re: Why hadoop jobs need setup and cleanup phases which would consume a lot of time ?
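The daemons (JobTracker / TaskTracker) should not run _any_ user code,
for the sake of their security and integrity. Job setup and cleanup
can involve user code (e.g. the OutputCommitter), hence they run as
separate setup/cleanup tasks rather than inside the daemons.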
As more jobs are submitted, they compete for the very few slots on
your 10-node cluster, hence the 'perceived' slowness. Jobs would be
affected the same way whether or not setup/cleanup tasks are run.
Note: There is a *single* setup task at the beginning of the job and a
*single* cleanup task at the end of the job. These are not per
map-task or per reduce-task.
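
To make the "single setup, single cleanup" point concrete, here is a
toy sketch in Python. It is not Hadoop code: the function name
plan_job_tasks and the task labels are made up for illustration, and
it only models the order and count of tasks for one job.

```python
# Toy model, NOT the Hadoop API: all names here are hypothetical.
# It lists the tasks scheduled for one job: a single setup task,
# the map and reduce tasks, then a single cleanup task.

def plan_job_tasks(num_maps, num_reduces):
    """Return the ordered task labels for one hypothetical job."""
    tasks = ["job-setup"]                                 # once per job
    tasks += [f"map-{i}" for i in range(num_maps)]        # one per map task
    tasks += [f"reduce-{i}" for i in range(num_reduces)]  # one per reduce task
    tasks.append("job-cleanup")                           # once per job
    return tasks


if __name__ == "__main__":
    for name in plan_job_tasks(num_maps=4, num_reduces=2):
        print(name)
```

Under this model the setup/cleanup overhead is two extra tasks per
job, regardless of how many map or reduce tasks the job has.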
Arun
On Mar 11, 2010, at 5:56 AM, Guo Leitao wrote:
> From our tests of hadoop-0.20.1 on 10 nodes, we find that the setup
> period gets longer as more jobs are submitted. I don't know why a
> map task is needed for setup. Why doesn't the JobTracker, or a
> single thread, take over this work?
>
> 2010/3/11 Jeff Zhang <zj...@gmail.com>
>
>> Hi Zhou,
>>
>> I looked at the source code, and it seems it is the JobTracker that
>> initiates the setup and cleanup tasks.
>> And why do you think the setup and cleanup phases consume a lot of
>> time? Actually, the time cost depends on the OutputCommitter.
>>
>> On Thu, Mar 11, 2010 at 11:04 AM, Min Zhou <co...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> Why do Hadoop jobs need setup and cleanup phases, which consume a
>>> lot of time? Why couldn't we achieve this the way a distributed
>>> RDBMS does, where a master process coordinates all slave nodes
>>> through sockets? I think it would save plenty of time if there
>>> were no setups and cleanups. What's Hadoop's philosophy on this?
>>>
>>> Thanks,
>>> Min
>>> --
>>> My research interests are distributed systems, parallel computing
>>> and
>>> bytecode based virtual machine.
>>>
>>> My profile:
>>> http://www.linkedin.com/in/coderplay
>>> My blog:
>>> http://coderplay.javaeye.com
>>>
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>