Posted to common-user@hadoop.apache.org by Diwakar Sharma <di...@gmail.com> on 2013/04/16 20:04:11 UTC

Get Hadoop cluster topology

I understand that when the NameNode starts up it reads the fsimage to get the
state of HDFS and then applies the edits file to bring it up to date.

But what about the cluster topology? Does the NameNode read config files such as
core-site.xml, slaves, etc. to determine its cluster topology, or does it use an
API to build it?


Thanks
Diwakar

Re: Get Hadoop cluster topology

Posted by Nikhil <mn...@gmail.com>.
From http://archive.cloudera.com/cdh/3/hadoop/hdfs_user_guide.html
(assuming you are using Cloudera Hadoop Distribution 3):

$ hadoop dfsadmin -refreshNodes   # re-reads the host include/exclude lists without a restart

-refreshNodes : Updates the set of hosts allowed to connect to the NameNode.
Re-reads the config file to update the values defined by dfs.hosts and
dfs.hosts.exclude and reads the entries (hostnames) in those files. Each
entry not defined in dfs.hosts but present in dfs.hosts.exclude is
decommissioned. Each entry defined in both dfs.hosts and dfs.hosts.exclude is
stopped from decommissioning if it has already been marked for decommission.
Entries present in neither list are decommissioned.
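
As a minimal sketch of how those two lists are usually wired up (the file
paths and the hostname below are illustrative assumptions, not something from
this thread):

<!-- hdfs-site.xml on the NameNode -->
<property>
  <name>dfs.hosts</name>
  <value>/etc/hadoop/conf/dfs.hosts</value>          <!-- include list -->
</property>
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.hosts.exclude</value>  <!-- exclude list -->
</property>

Both files are plain text with one hostname or IP per line. To decommission a
node you would append it to the exclude file and ask the NameNode to re-read
the lists:

$ echo "datanode03.example.com" >> /etc/hadoop/conf/dfs.hosts.exclude
$ hadoop dfsadmin -refreshNodes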

There is also a -printTopology switch, useful for looking at the current
topology view.

-printTopology : Prints the topology of the cluster. Displays a tree of racks
and the datanodes attached to those racks, as viewed by the NameNode.
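
For example (the rack name, addresses and hostnames below are made up; only
the shape of the output reflects what the command prints):

$ hadoop dfsadmin -printTopology
Rack: /default-rack
   10.1.1.11:50010 (datanode01.example.com)
   10.1.1.12:50010 (datanode02.example.com)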

In most cases, however, I have seen that updating the topology with wrong
information (a bad rack number, stray tabs or spaces) gets the master services
into trouble, and in such cases a restart is required.
I have tried looking for ways to refresh the topology cache on both the
NameNode and the JobTracker without bouncing them, but this can get a little
tricky.
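
The topology information itself normally comes from a rack-mapping script
configured through topology.script.file.name in core-site.xml
(net.topology.script.file.name in later releases). A minimal sketch, where the
script path, the mapping file and its contents are all assumptions for
illustration:

<!-- core-site.xml -->
<property>
  <name>topology.script.file.name</name>
  <value>/etc/hadoop/conf/rack-topology.sh</value>
</property>

#!/bin/bash
# /etc/hadoop/conf/rack-topology.sh
# Maps each host/IP argument to a rack read from topology.data, which holds
# "host /rackN" pairs separated by single spaces (no tabs) -- exactly the kind
# of whitespace the note above warns about.
MAP=/etc/hadoop/conf/topology.data
for host in "$@"; do
  rack=$(awk -v h="$host" '$1 == h {print $2; exit}' "$MAP")
  echo "${rack:-/default-rack}"    # unknown hosts fall back to the default rack
done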

For more information, see:
http://grokbase.com/t/hadoop/common-user/121yqsme6v/refresh-namenode-topology-cache



On Tue, Apr 16, 2013 at 11:39 PM, shashwat shriparv <
dwivedishashwat@gmail.com> wrote:

>
> On Tue, Apr 16, 2013 at 11:34 PM, Diwakar Sharma <diwakar.hadoop@gmail.com
> > wrote:
>
>> cluster topology, or does it use an API to build it?
>
>
> If you stop and start the cluster, Hadoop reads the configuration files
> for sure.
>
>
>
> ∞
> Shashwat Shriparv
>
>

Re: Get Hadoop cluster topology

Posted by shashwat shriparv <dw...@gmail.com>.
On Tue, Apr 16, 2013 at 11:34 PM, Diwakar Sharma
<di...@gmail.com>wrote:

> cluster topology, or does it use an API to build it?


If you stop and start the cluster, Hadoop reads the configuration files for
sure.
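
As a quick illustration (assuming a tarball-style install where the start/stop
scripts live in $HADOOP_HOME/bin; packaged installs such as CDH use service
scripts instead):

# Changes to core-site.xml, hdfs-site.xml, slaves or the topology script are
# picked up when the daemons come back up.
$ $HADOOP_HOME/bin/stop-dfs.sh
$ $HADOOP_HOME/bin/start-dfs.sh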



∞
Shashwat Shriparv
