Posted to common-user@hadoop.apache.org by "Zhang, jian" <jz...@freewheel.tv> on 2008/02/21 10:00:09 UTC

Questions about namenode and JobTracker configuration.

Hi all,

I have a small question about configuration.

The Hadoop documentation page says:

" Typically you choose one machine in the cluster to act as the NameNode
and one machine as to act as the JobTracker, exclusively. The rest of
the machines act as both a DataNode and TaskTracker and are referred to
as slaves."

Does that mean the JobTracker, like the NameNode, is not a slave?

NameNode and DataNode form HDFS. Since the JobTracker needs to interact
with the TaskTrackers, which reside on the HDFS nodes, I would think
that, to make communication easier, it should at least be part of HDFS.

Best Regards

Jian Zhang

Re: Questions about namenode and JobTracker configuration.

Posted by Amar Kamat <am...@yahoo-inc.com>.
Zhang, jian wrote:
> Hi all,
>
> I have a small question about configuration.
>
> The Hadoop documentation page says:
>
> " Typically you choose one machine in the cluster to act as the NameNode
> and one machine as to act as the JobTracker, exclusively. The rest of
> the machines act as both a DataNode and TaskTracker and are referred to
> as slaves."
>
> Does that mean the JobTracker, like the NameNode, is not a slave?
>
JobTracker and NameNode are daemons that run on a machine (frequently
called the master). The master node can also act as a slave node. The
JobTracker and NameNode basically do the bookkeeping and scheduling
work. On a large cluster the load on the JobTracker and NameNode is
usually high, so it's recommended to run these daemons on a separate
machine, but this is not mandatory.
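
As a rough illustration of running the two master daemons on separate
machines, a minimal conf/hadoop-site.xml sketch might look like the
following. The hostnames and port numbers are placeholders (assumptions,
not taken from this thread), and the exact value format accepted for
fs.default.name can differ between Hadoop versions, so check the
documentation for your release.

```xml
<?xml version="1.0"?>
<!-- conf/hadoop-site.xml: hypothetical hostnames and ports, for illustration only -->
<configuration>
  <!-- Where clients and DataNodes find the NameNode (HDFS master) -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
  <!-- Where clients and TaskTrackers find the JobTracker (MapReduce master),
       here assumed to be on a different host from the NameNode -->
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:9001</value>
  </property>
</configuration>
```

Pointing both values at the same host would correspond to the
single-master setup, which is also allowed since the separation is only
a recommendation.
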
> NameNode and DataNode form the HDFS. Since the JobTracker needs to
> interact with TaskTracker which resides in HDFS, 
TaskTracker and DataNode are processes on the slave nodes. The
TaskTracker communicates with the JobTracker, while the DataNode
communicates with the NameNode. The DFS is designed so that it can
function without MapReduce, purely as distributed storage. The
TaskTracker never communicates with the NameNode; it's the JobTracker
that does. Mostly the TaskTracker concentrates on doing the work
locally, i.e. spawning JVMs to run the maps.

Amar
> to make the
> communication easier, I think it should be at least part of the HDFS.
>
> Best Regards
>
> Jian Zhang