Posted to user@hadoop.apache.org by 尉雁磊 <yu...@qianxin.com> on 2022/11/10 08:22:40 UTC

hdfs dfsadmin -printTopology The target of the information may be abnormal

hdfs dfsadmin -printTopology always gets its information from the same namenode in the cluster, whether that namenode is active or standby. I don't think this is normal; this command should always get its information from the active namenode!

Re: hdfs dfsadmin -printTopology The target of the information may be abnormal

Posted by Ayush Saxena <ay...@gmail.com>.
What you are trying to achieve via that extra parameter can easily be done
using the generic options: use -fs and specify the namenode and port for
which you want to get the results [1]. Check the overview [2] to see how to
use them.
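
As a minimal sketch of the -fs approach described above: the helper name
print_topology_from, the HDFS_BIN variable, and the host and port in the
usage comment are illustrative assumptions, not part of Hadoop. Only the
-fs generic option and the -printTopology option come from the Hadoop
documentation.

```shell
#!/bin/sh
# Sketch only. HDFS_BIN is a hypothetical override so the command line can be
# previewed (for example HDFS_BIN=echo) without touching a cluster.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Print the topology as reported by one specific namenode.
# $1 is that namenode's RPC address as host:port (placeholder values below).
# The generic option -fs is placed before -printTopology, following the
# documented [GENERIC_OPTIONS] [COMMAND_OPTIONS] order.
print_topology_from() {
    "$HDFS_BIN" dfsadmin -fs "hdfs://$1" -printTopology
}

# Usage (nn1.example.com and port 8020 are assumed placeholders):
#   print_topology_from nn1.example.com:8020
```

Setting HDFS_BIN=echo before calling the function prints the command that
would be run instead of running it.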

The second point doesn't make sense: fetching from all the namenodes,
returning the result for one and logging the others isn't something doable.

-Ayush

[1] https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CommandsManual.html#:~:text=Generic%20Options,-Many%20subcommands%20honor&text=Use%20value%20for%20given%20property.&text=Specify%20comma%20separated%20files%20to,Applies%20only%20to%20job.&text=Specify%20default%20filesystem%20URL%20to%20use

[2] https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html



On Thu, 10 Nov 2022 at 16:31, 尉雁磊 <tr...@163.com> wrote:

>
>
> I agree with you, and I wonder if there is anything that can be done to
> help administrators look into possible problems in this area.
>
> I have two ideas:
>
> 1.  Add a namenodeIp parameter to hdfs dfsadmin -printTopology to obtain
> the rack information from the specified namenode.
>
> 2.  Add debug information to the printTopology method of class DFSAdmin.
> However, the command only queries one fixed namenode, so the debug logs of
> the other namenodes cannot be printed.

Re: hdfs dfsadmin -printTopology The target of the information may be abnormal

Posted by 尉雁磊 <tr...@163.com>.





I agree with you, and I wonder if there is anything that can be done to help administrators look into possible problems in this area.

I have two ideas:

1.  Add a namenodeIp parameter to hdfs dfsadmin -printTopology to obtain the rack information from the specified namenode.

2.  Add debug information to the printTopology method of class DFSAdmin.  However, the command only queries one fixed namenode, so the debug logs of the other namenodes cannot be printed.

At 2022-11-10 18:44:19, "Ayush Saxena" <ay...@gmail.com> wrote:

If some sort of debugging is going on that suspects a topology misconfiguration, you need to check all the namenodes anyway, in case one namenode is misconfigured and another is not. The issue may not surface while the properly configured namenode is the active namenode, but one failover can break things.

Secondly, triaging a suspected rack misconfiguration just by checking the active namenode isn't a complete solution: what if the present standby namenode was the active one when the issue occurred? In such cases you have to check all the namenodes anyway.

Getting the topology from individual namenodes is a doable task for any admin and isn't that difficult. If it weren't so simple to do, we could maybe have explored getting the topology from all the namenodes as part of the DebugAdmin commands.


-Ayush





Re: Re: hdfs dfsadmin -printTopology The target of the information may be abnormal

Posted by Ayush Saxena <ay...@gmail.com>.
If some sort of debugging is going on that suspects a topology
misconfiguration, you need to check all the namenodes anyway, in case one
namenode is misconfigured and another is not. The issue may not surface
while the properly configured namenode is the active namenode, but one
failover can break things.

Secondly, triaging a suspected rack misconfiguration just by checking the
active namenode isn't a complete solution: what if the present standby
namenode was the active one when the issue occurred? In such cases you have
to check all the namenodes anyway.

Getting the topology from individual namenodes is a doable task for any
admin and isn't that difficult. If it weren't so simple to do, we could
maybe have explored getting the topology from all the namenodes as part of
the DebugAdmin commands.
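
One way an admin might check every namenode, sketched under assumptions:
the function name dump_topologies, the HDFS_BIN variable, the output
directory, and the host:port values are all hypothetical. The sketch simply
repeats the -fs form of -printTopology once per namenode and saves each
result so the files can be compared.

```shell
#!/bin/sh
# Sketch only. HDFS_BIN is a hypothetical override (for example HDFS_BIN=echo)
# for previewing the commands without a cluster.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Save the topology reported by each namenode into its own file.
# $1 is the output directory; the remaining arguments are namenode RPC
# addresses as host:port, supplied by the caller.
dump_topologies() {
    out_dir="$1"
    shift
    mkdir -p "$out_dir" || return 1
    for nn in "$@"; do
        # Replace ':' with '_' so the address makes a convenient file name.
        file="$out_dir/$(printf '%s' "$nn" | tr ':' '_').txt"
        "$HDFS_BIN" dfsadmin -fs "hdfs://$nn" -printTopology > "$file" \
            || echo "could not query $nn" >&2
    done
}

# Usage (placeholder hosts and port):
#   dump_topologies /tmp/topology nn1.example.com:8020 nn2.example.com:8020
#   diff /tmp/topology/nn1.example.com_8020.txt /tmp/topology/nn2.example.com_8020.txt
```

An empty diff would suggest both namenodes report the same rack layout. If
the two outputs list entries in a different order, sorting both files
before comparing may be needed.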

-Ayush


On Thu, 10 Nov 2022 at 15:45, 尉雁磊 <tr...@163.com> wrote:

>
>
> So what you are saying is that this is an administration issue, not a code
> issue.  But if the administrator has misconfigured rack awareness on a
> namenode, they will not be able to locate the actual problem from the logs
> and will only be able to check whether the deployment steps were correct.

Re:Re: hdfs dfsadmin -printTopology The target of the information may be abnormal

Posted by 尉雁磊 <tr...@163.com>.





So what you are saying is that this is an administration issue, not a code issue.  But if the administrator has misconfigured rack awareness on a namenode, they will not be able to locate the actual problem from the logs and will only be able to check whether the deployment steps were correct.

At 2022-11-10 17:34:37, "Ayush Saxena" <ay...@gmail.com> wrote:

In a stable cluster, usually all the datanodes report to all the namenodes, and the information would be more or less the same on all of them. This isn't data that goes stale and lands you in a mess. Moreover, these aren't user commands but admin commands; it is assumed that the admin has an idea about the system and how it behaves. There are ways to get this detail from a specific namenode if required; even each namenode's UI gives details about the datanode states and so on.

From the code point of view, I don't think this is a good idea to change, or something which is going to get accepted.


-Ayush



Re: hdfs dfsadmin -printTopology The target of the information may be abnormal

Posted by Ayush Saxena <ay...@gmail.com>.
In a stable cluster, usually all the datanodes report to all the namenodes,
and the information would be more or less the same on all of them. This
isn't data that goes stale and lands you in a mess. Moreover, these aren't
user commands but admin commands; it is assumed that the admin has an idea
about the system and how it behaves. There are ways to get this detail from
a specific namenode if required; even each namenode's UI gives details about
the datanode states and so on.
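
For an admin who first wants to know which namenode is currently active and
which is standby, a minimal sketch using hdfs haadmin -getServiceState is
shown here. The function name show_ha_states and the HDFS_BIN variable are
hypothetical, and the service IDs nn1 and nn2 in the usage comment are
assumed examples of the IDs configured for the nameservice.

```shell
#!/bin/sh
# Sketch only. HDFS_BIN is a hypothetical override (for example HDFS_BIN=echo)
# for previewing the commands without a cluster.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Print "<service id>: <state>" for each namenode service ID given.
# If the query fails, the state is reported as "unknown".
show_ha_states() {
    for id in "$@"; do
        state=$("$HDFS_BIN" haadmin -getServiceState "$id" 2>/dev/null) \
            || state="unknown"
        printf '%s: %s\n' "$id" "$state"
    done
}

# Usage (nn1 and nn2 are assumed service IDs):
#   show_ha_states nn1 nn2
```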

From the code point of view, I don't think this is a good idea to change,
or something which is going to get accepted.

-Ayush

On Thu, 10 Nov 2022 at 13:53, 尉雁磊 <yu...@qianxin.com> wrote:

> hdfs dfsadmin -printTopology always gets its information from the same
> namenode in the cluster, whether that namenode is active or standby. I
> don't think this is normal; this command should always get its information
> from the active namenode!
>