Posted to common-dev@hadoop.apache.org by prajyot bankade <pr...@gmail.com> on 2010/04/28 05:32:25 UTC

Regarding Job tracker

Hello Everyone,

I have just started reading about the Hadoop JobTracker. In one book I read
that there is only one JobTracker, which is responsible for distributing tasks
to the worker nodes. Please correct me if I say something wrong.

I have a few questions:

Why is there only one JobTracker?
What will happen if that JobTracker fails or crashes?
Can we have more than one JobTracker?
Can I create my own backup JobTracker to support the system if the JobTracker
crashes?

This is my first mail; please help me with this.

Thanks and Regards,

Prajyot

Re: Regarding Job tracker

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Interesting!  Here's what the Condor folks have been doing with MapReduce:

http://www.cs.wisc.edu/condor/CondorWeek2010/condor-presentations/thain-condor-hadoop.pdf

Dunno why we don't see more of them (maybe it's just because I'm not subscribed to the MAPREDUCE mailing list?  I have too many emails...).

Oh well... my point was that one can have multiple schedulers on a single cluster, achieve great utilization, and increase reliability.  It has its disadvantages too - preserving data locality across multiple schedulers is tough, and it's "yet another component" for small sites.

Brian

On Apr 28, 2010, at 12:49 PM, Arun C Murthy wrote:

> 
>> They have gotten lots of mileage out of breaking the scheduling and the resource provisioning into two different components.  Having multiple jobtrackers would be very advantageous if it didn't require you to partition your pool.
>> 
> 
> https://issues.apache.org/jira/browse/MAPREDUCE-279
> 
> Arun


Re: Regarding Job tracker

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
> They have gotten lots of mileage out of breaking the scheduling and
> the resource provisioning into two different components.  Having
> multiple jobtrackers would be very advantageous if it didn't require
> you to partition your pool.
>

https://issues.apache.org/jira/browse/MAPREDUCE-279

Arun

Re: Regarding Job tracker

Posted by Brian Bockelman <bb...@cse.unl.edu>.
On Apr 28, 2010, at 5:04 AM, Steve Loughran wrote:

> prajyot bankade wrote:
>> Hello Everyone,
>> I have just started reading about the Hadoop JobTracker. In one book I read
>> that there is only one JobTracker, which is responsible for distributing tasks
>> to the worker nodes. Please correct me if I say something wrong.
>> I have a few questions:
>> Why is there only one JobTracker?
> 
> to provide a single place to make scheduling decisions
> 
(thread hijack)

Why is this an advantage? (I mean, I know it's an advantage in terms of the current architecture... just indulging in some blue-sky thinking here).

One of the projects I work with is the Condor Project out of Madison:

http://www.cs.wisc.edu/condor/

who have been building a distributed computing infrastructure for about 20 years.  Here is one of my favorite "overview" papers of theirs:

http://www.cs.wisc.edu/condor/doc/condor-practice.pdf  (my favorite is sections 4, 5, and 6.2)

They have gotten lots of mileage out of breaking the scheduling and the resource provisioning into two different components.  Having multiple jobtrackers would be very advantageous if it didn't require you to partition your pool.

One use case would be to separate out the "production work" from "research activities".  You could have a 'production jobtracker' which is accessible to a small number of users and has "known good", pre-approved, business-critical workflows and a 'research jobtracker' which more folks are allowed to use without pre-approved workflows.  This way, if a researcher accidentally crashes the jobtracker in the middle of the night, your business-critical work continues.
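That production/research split could be sketched in 0.20-era Hadoop configuration roughly as follows: each pool of TaskTrackers gets its own mapred-site.xml pointing at a different JobTracker, while both pools share the same HDFS. The hostnames and the port here are made up for illustration, not taken from any real deployment:

```xml
<!-- mapred-site.xml for TaskTrackers in the "production" pool
     (hostnames and port are hypothetical) -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jt-production.example.com:9001</value>
  </property>
</configuration>

<!-- mapred-site.xml for TaskTrackers in the "research" pool -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jt-research.example.com:9001</value>
  </property>
</configuration>
```

Both pools would keep the same fs.default.name in core-site.xml, so the two JobTrackers schedule against one shared HDFS - which is exactly where the data-locality complication comes from.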

There's plenty of merit to the idea, and it's worth thinking about.

Brian


Re: Regarding Job tracker

Posted by Steve Loughran <st...@apache.org>.
prajyot bankade wrote:
> Hello Everyone,
> 
> I have just started reading about the Hadoop JobTracker. In one book I read
> that there is only one JobTracker, which is responsible for distributing tasks
> to the worker nodes. Please correct me if I say something wrong.
> 
> I have a few questions:
> 
> Why is there only one JobTracker?

to provide a single place to make scheduling decisions

> What will happen if that JobTracker fails or crashes?

Look at the source. You will find it saves state to the filesystem.

> Can we have more than one JobTracker?

yes, if you partition up your workers and bind them to different JTs, 
you can have >1 JT per HDFS filesystem, but it complicates locality, as 
each JT only schedules work to its workers. I hope your network cables 
are fat enough.

> Can I create my own backup JobTracker to support the system if the JobTracker
> crashes?

Better to monitor the health of the JT and restart that service/machine 
when it goes down. As it serves up HTTP pages, it is fairly easy to 
detect a complete failure. It is harder to detect situations in which 
jobs get submitted but never executed; a periodic test job can catch 
that for you.
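The HTTP-based liveness check described above could be sketched like this. A minimal example: port 50030 is the default JobTracker web UI port in 0.20-era Hadoop, but the hostname is hypothetical and the restart action is site-specific, so treat both as assumptions. As noted, this only catches complete failure - a JT that accepts jobs but never runs them needs a periodic test job instead.

```python
import urllib.request
import urllib.error


def jobtracker_alive(url, timeout=5):
    """Return True if the JobTracker web UI answers an HTTP GET with 200.

    Any connection failure, DNS error, timeout, or HTTP error status
    is treated as "down".
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


# Example (hypothetical host; 50030 is the default JT web UI port):
#   if not jobtracker_alive("http://jobtracker.example.com:50030/"):
#       ...trigger your site's restart procedure...
```

Run from cron or a monitoring system, a negative result would trigger whatever restart mechanism the site uses for the JT service or machine.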