Posted to common-user@hadoop.apache.org by "Zhang, jian" <jz...@freewheel.tv> on 2008/02/21 10:00:09 UTC
Questions about namenode and JobTracker configuration.
Hi, All
I have a small question about configuration.
The Hadoop documentation page says:
"Typically you choose one machine in the cluster to act as the NameNode
and one machine to act as the JobTracker, exclusively. The rest of
the machines act as both a DataNode and TaskTracker and are referred to
as slaves."
Does that mean that the JobTracker, like the NameNode, is not a slave?
The NameNode and DataNodes form the HDFS. Since the JobTracker needs to
interact with the TaskTrackers, which reside in HDFS, I think it should
be at least part of the HDFS to make the communication easier.
Best Regards
Jian Zhang
Re: Questions about namenode and JobTracker configuration.
Posted by Amar Kamat <am...@yahoo-inc.com>.
Zhang, jian wrote:
> Hi, All
>
> I have a small question about configuration.
>
> In Hadoop Documentation page, it says
>
> " Typically you choose one machine in the cluster to act as the NameNode
> and one machine as to act as the JobTracker, exclusively. The rest of
> the machines act as both a DataNode and TaskTracker and are referred to
> as slaves."
>
> Does that mean the JobTracker is not a slave as NameNode ?
>
The JobTracker and NameNode are daemons that run on a machine, which is
frequently called a master. A master node can also act as a slave node.
The JobTracker and NameNode basically do the book-keeping and scheduling
work. On a large cluster the load on the JobTracker and NameNode is
usually high, so it's recommended to run these daemons on dedicated
machines, but this is not mandatory.
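As a rough illustration of the two layouts described above, here is a
minimal Python sketch that maps hosts to the daemons they would run.
The function name plan_cluster, the dedicated_masters flag, and the
choice of the first two hosts as masters are assumptions made for this
example only; they are not part of Hadoop.

```python
def plan_cluster(hosts, dedicated_masters=True):
    """Return a dict mapping each host to the daemons it would run.

    Hypothetical helper: the first host gets the NameNode and the
    second (if any) gets the JobTracker.
    """
    if not hosts:
        raise ValueError("need at least one host")

    namenode_host = hosts[0]
    jobtracker_host = hosts[1] if len(hosts) > 1 else hosts[0]
    masters = {namenode_host, jobtracker_host}

    layout = {host: [] for host in hosts}
    layout[namenode_host].append("NameNode")
    layout[jobtracker_host].append("JobTracker")

    for host in hosts:
        # Keep masters free of slave daemons only if other hosts
        # remain to do the storage and task work.
        if dedicated_masters and host in masters and len(hosts) > len(masters):
            continue
        layout[host] += ["DataNode", "TaskTracker"]
    return layout


if __name__ == "__main__":
    for host, daemons in plan_cluster(["n1", "n2", "n3", "n4"]).items():
        print(host, daemons)
```

With dedicated_masters=False every host, including the masters, also
runs a DataNode and a TaskTracker, which matches the "master can also
act as a slave" case.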
> NameNode and DataNode form the HDFS. Since the JobTracker needs to
> interact with TaskTracker which resides in HDFS,
The TaskTracker and DataNode are processes on the slave nodes. The
TaskTracker communicates with the JobTracker, while the DataNode
communicates with the NameNode. The DFS is designed so that it can
function without MapReduce, purely as distributed storage. The
TaskTracker never communicates with the NameNode; it's the JobTracker
that does. The TaskTracker mostly concentrates on doing the work
locally, i.e., spawning JVMs to run the map tasks.
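The communication pattern described above can be written down as a
small table. This Python sketch only encodes the statements in this
reply; the names LINKS, talks_to, and links_within are made up for the
illustration and do not correspond to any Hadoop API.

```python
# Daemon pairs that talk to each other, as stated in the reply.
LINKS = {
    frozenset({"TaskTracker", "JobTracker"}),
    frozenset({"DataNode", "NameNode"}),
    frozenset({"JobTracker", "NameNode"}),
}

# The daemons that make up the DFS on its own.
HDFS_DAEMONS = {"NameNode", "DataNode"}


def talks_to(a, b):
    """True if daemons a and b communicate directly in this model."""
    return frozenset({a, b}) in LINKS


def links_within(daemons):
    """Links whose two endpoints are both in the given set of daemons."""
    return {link for link in LINKS if link <= daemons}


if __name__ == "__main__":
    print(talks_to("TaskTracker", "JobTracker"))
    print(talks_to("TaskTracker", "NameNode"))
    print(links_within(HDFS_DAEMONS))
```

Restricting the table to HDFS_DAEMONS leaves the DataNode/NameNode
link intact, which is one way to see that the DFS does not depend on
the MapReduce daemons.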
Amar
> to make the
> communication easier, I think it should be at least part of the HDFS.