Posted to user@hadoop.apache.org by Jameson Li <ho...@gmail.com> on 2012/09/13 05:03:33 UTC

rack topology data update

Our Hadoop version is hadoop-0.20-append+4.

We have configured rack awareness on the namenode.
But when I add a new datanode, update the topology data file, and restart
the datanode, all I see in the namenode log is:
2012-09-13 10:35:25,074 INFO org.apache.hadoop.net.NetworkTopology: Adding
a new node: /default-rack/ipc:50010
So should I restart the namenode?
Is there a command like 'hadoop dfsadmin -refreshtopology'?

My configuration:

*core-site.xml:*
<property>
<name>topology.script.file.name</name>
<value>conf/rack-awareness.sh</value>
</property>
<property>
<name>topology.script.number.args</name>
<value>1000</value>
</property>

*conf/rack-awareness.sh:*
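#!/bin/sh
# For each host or IP argument, print its rack followed by a space.
# Unknown hosts fall back to /default-rack.

HADOOP_CONF=/opt/hadoop/conf

while [ $# -gt 0 ] ; do
  nodeArg=$1
  result=""
  # Each topology.data line is "<host-or-ip> <rack>"; the last match wins.
  while read -r node rack rest ; do
    if [ "$node" = "$nodeArg" ] ; then
      result="$rack"
    fi
  done < "${HADOOP_CONF}/topology.data"
  shift
  if [ -z "$result" ] ; then
    printf '%s ' "/default-rack"
  else
    printf '%s ' "$result"
  fi
done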

topology.data:
ipa rackA
ipb rackA
ipc rackB
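
A lookup like the one above can be exercised by hand before the namenode ever calls it. The following is a minimal sketch, not the poster's script: lookup_rack and the temporary data file are hypothetical stand-ins, and the rack names are written with a leading slash on the assumption that Hadoop expects rack paths in the same form as /default-rack.

```shell
#!/bin/sh
# Hypothetical helper: print the rack for one host from a "<host> <rack>" file,
# falling back to /default-rack when the host has no entry.
lookup_rack() {
  data_file=$1
  wanted=$2
  result=""
  # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
  while read -r node rack rest || [ -n "$node" ]; do
    if [ "$node" = "$wanted" ]; then
      result=$rack
    fi
  done < "$data_file"
  printf '%s\n' "${result:-/default-rack}"
}

# Sample data mirroring topology.data above (host names are placeholders).
sample=$(mktemp)
printf '%s\n' 'ipa /rackA' 'ipb /rackA' 'ipc /rackB' > "$sample"

lookup_rack "$sample" ipc   # a listed host
lookup_rack "$sample" ipd   # an unlisted host falls back to the default
```

Checking the output for a host that is about to be added shows whether the namenode would resolve it to a real rack or to the default one.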


I also searched the mailing list for "topology.script.file update"
and found this mail:
Tom Hall 2011-10-27, 16:07
I was hoping that if I updated the file it would give new answers as
datanodes were restarted and reconnected but that does not seem to be
the case.
Surely I don't need to restart the namenode...

But there was no reply.
Can somebody help me?


Focused on MySQL, MSSQL, Oracle, Hadoop

Re: rack topology data update

Posted by Harsh J <ha...@cloudera.com>.
Jameson,

Yes, unfortunately that is currently the case.

On Fri, Sep 14, 2012 at 8:58 AM, Jameson Li <ho...@gmail.com> wrote:
> Harsh J,
>
> If a new datanode has joined the cluster and has default rack info cached
> in the namenode, there is no way to change that cache other than restarting
> the namenode.
> Am I right?
>
> Focused on MySQL, MSSQL, Oracle, Hadoop
>
>
>
> 2012/9/13 Harsh J <ha...@cloudera.com>
>>
>> Hey Steve,
>>
>> True about the decisions part, that still needs the ugly fixing of
>> re-replication.
>>
>> I think I saw it on a JIRA by Patrick A. that was trying to change the
>> way we did this. But placement is also pluggable in 2.x right? Let me
>> find that JIRA (but yeah, am unsure if it was committed).
>>
>> On Thu, Sep 13, 2012 at 2:40 PM, Steve Loughran <st...@hortonworks.com>
>> wrote:
>> >
>> >
>> > On 13 September 2012 09:03, Harsh J <ha...@cloudera.com> wrote:
>> >>
>> >>
>> >> This should be fixed in one of the 2.x releases, where we also refresh
>> >> the cached values.
>> >>
>> >
>> > Really? Which JIRA?
>> >
>> > I've been making changes to the topology logic so you can do some
>> > preflight checking and dump the topologies, but didn't think a clear and
>> > reload was in there. Some decisions on block placement strategy (flat vs
>> > hierarchical) are made early on, so going from flat to multi-switch is
>> > not something I'd recommend.
>>
>>
>>
>> --
>> Harsh J
>
>



-- 
Harsh J

Re: rack topology data update

Posted by Jameson Li <ho...@gmail.com>.
Harsh J,

If a new datanode has joined the cluster and has default rack info
cached in the namenode, there is no way to change that cache other than
restarting the namenode.
Am I right?

Focused on MySQL, MSSQL, Oracle, Hadoop


2012/9/13 Harsh J <ha...@cloudera.com>

> Hey Steve,
>
> True about the decisions part, that still needs the ugly fixing of
> re-replication.
>
> I think I saw it on a JIRA by Patrick A. that was trying to change the
> way we did this. But placement is also pluggable in 2.x right? Let me
> find that JIRA (but yeah, am unsure if it was committed).
>
> On Thu, Sep 13, 2012 at 2:40 PM, Steve Loughran <st...@hortonworks.com>
> wrote:
> >
> >
> > On 13 September 2012 09:03, Harsh J <ha...@cloudera.com> wrote:
> >>
> >>
> >> This should be fixed in one of the 2.x releases, where we also refresh
> >> the cached values.
> >>
> >
> > Really? Which JIRA?
> >
> > I've been making changes to the topology logic so you can do some
> > preflight checking and dump the topologies, but didn't think a clear and
> > reload was in there. Some decisions on block placement strategy (flat vs
> > hierarchical) are made early on, so going from flat to multi-switch is
> > not something I'd recommend.
>
>
>
> --
> Harsh J
>

Re: rack topology data update

Posted by Harsh J <ha...@cloudera.com>.
Hey Steve,

True about the decisions part, that still needs the ugly fixing of
re-replication.

I think I saw it on a JIRA by Patrick A. that was trying to change the
way we did this. But placement is also pluggable in 2.x right? Let me
find that JIRA (but yeah, am unsure if it was committed).

On Thu, Sep 13, 2012 at 2:40 PM, Steve Loughran <st...@hortonworks.com> wrote:
>
>
> On 13 September 2012 09:03, Harsh J <ha...@cloudera.com> wrote:
>>
>>
>> This should be fixed in one of the 2.x releases, where we also refresh
>> the cached values.
>>
>
> Really? Which JIRA?
>
> I've been making changes to the topology logic so you can do some preflight
> checking and dump the topologies, but didn't think a clear and reload was in
> there. Some decisions on block placement strategy (flat vs hierarchical) are
> made early on, so going from flat to multi-switch is not something I'd
> recommend.



-- 
Harsh J

Re: rack topology data update

Posted by Steve Loughran <st...@hortonworks.com>.
On 13 September 2012 09:03, Harsh J <ha...@cloudera.com> wrote:

>
> This should be fixed in one of the 2.x releases, where we also refresh
> the cached values.
>
>
Really? Which JIRA?

I've been making changes to the topology logic so you can do some preflight
checking and dump the topologies, but didn't think a clear and reload was
in there. Some decisions on block placement strategy (flat vs hierarchical)
are made early on, so going from flat to multi-switch is not something I'd
recommend.
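
The kind of topology dump mentioned above can be approximated outside Hadoop by summarising the mapping file itself. This is a sketch only: dump_topology is a hypothetical helper that reads the two-column topology.data layout from the original post, and it is not the topology change described in this message.

```shell
#!/bin/sh
# Hypothetical helper: list the hosts on each rack from a "<host> <rack>" file
# and report how many distinct racks the file defines.
dump_topology() {
  awk 'NF >= 2 { hosts[$2] = hosts[$2] " " $1 }
       END {
         n = 0
         for (r in hosts) { n++; print r ":" hosts[r] }
         print "racks: " n
       }' "$1" | LC_ALL=C sort
}

# Usage (path assumed from the original post):
#   dump_topology /opt/hadoop/conf/topology.data
```

A rack count of one suggests the file describes a flat layout, which is worth knowing before moving to a multi-rack mapping.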

Re: rack topology data update

Posted by Harsh J <ha...@cloudera.com>.
Hi Jameson,

As I'd mentioned, due to the current behavior, if the NN has already
cached a bad topology mapping value, it will not forget it despite a
-refreshNodes command. Otherwise, there's no problem. That is, if
you've done the following, the NN may require a restart:

1. Start the new DN (at this point, the DN gets mapped to the default rack
as there's no entry, and this is cached)
2. Update the topology file, run refreshNodes

This should be fixed in one of the 2.x releases, where we also refresh
the cached values.
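
The failure sequence above suggests getting the mapping in place before a DataNode registers for the first time, and that ordering can be guarded with a small preflight check. This is a sketch under assumptions: has_rack_entry and preflight are hypothetical helpers, the file is the two-column topology.data from the original post, and the check only helps for a node the NN has not seen yet. Once the default rack is cached, the thread's conclusion is that this version needs a NameNode restart.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the host has a rack entry in the file.
has_rack_entry() {
  data_file=$1
  wanted=$2
  while read -r node rack rest || [ -n "$node" ]; do
    if [ "$node" = "$wanted" ] && [ -n "$rack" ]; then
      return 0
    fi
  done < "$data_file"
  return 1
}

# Path assumed from the original post; override TOPOLOGY_DATA to test elsewhere.
TOPOLOGY_DATA=${TOPOLOGY_DATA:-/opt/hadoop/conf/topology.data}

# Report whether a host is mapped; returns non-zero when it is not.
preflight() {
  if has_rack_entry "$TOPOLOGY_DATA" "$1"; then
    echo "OK: $1 is mapped; the DataNode can be started"
  else
    echo "STOP: add $1 to $TOPOLOGY_DATA before its first registration"
    return 1
  fi
}
```

Running preflight with the new node's address on the NameNode host, before the DataNode is started for the first time, avoids the situation where the default rack gets cached.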

On Thu, Sep 13, 2012 at 12:49 PM, Jameson Li <ho...@gmail.com> wrote:
> Hi Harsh J, Viji R, Saurabh bhutyani,
>
> Thanks for all of your replies.
>
> But the namenode really does not refresh the rack info.
>
> Is it an issue with my Hadoop version? It is based on hadoop-0.20-append,
> with 4 patches on top that I don't think have anything to do with
> rack awareness.
>
> If I have to restart the namenode to refresh the rack info every time I
> add new nodes, I will go crazy...
>
> Focused on MySQL, MSSQL, Oracle, Hadoop
>
>
>
> 2012/9/13 Saurabh bhutyani <s4...@gmail.com>
>>
>> I believe running the following command on namenode should refresh it.
>>
>> 'hadoop dfsadmin -refreshNodes'
>>
>> Thanks & Regards,
>> Saurabh Bhutyani
>>
>> Call  : 9820083104
>> Gtalk: s4saurabh@gmail.com
>>
>>
>>
>>
>> On Thu, Sep 13, 2012 at 11:25 AM, Viji R <vi...@cloudera.com> wrote:
>>>
>>> Hi Jameson,
>>>
>>> If the NameNode has cached the wrong value earlier, it will not
>>> refresh that until you restart it.
>>>
>>> On Thu, Sep 13, 2012 at 11:21 AM, Jameson Li <ho...@gmail.com> wrote:
>>> > Hi harsh,
>>> >
>>> > I have followed the steps you suggested.
>>> >
>>> > 1. Stop the new datanode. (I had already modified the topology file on
>>> > the namenode.)
>>> > 2. Run 'hadoop dfsadmin -refreshNodes' on the namenode.
>>> > 3. Start the new datanode.
>>> >
>>> > But it really does not update the topology mapping.
>>> > The namenode log just shows this at startup:
>>> > "
>>> > 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology:
>>> > Removing a node: /default-rack/10.0.10.100:50010
>>> > 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology:
>>> > Adding a new node: /default-rack/10.0.10.100:50010
>>> > "
>>> >
>>> >
>>> > Focused on MySQL, MSSQL, Oracle, Hadoop
>>> >
>>> >
>>> >
>>> > 2012/9/13 Harsh J <ha...@cloudera.com>
>>> >>
>>> >> the DN only after (2) so it picks up the right mapping an
>>> >
>>> >
>>
>>
>



-- 
Harsh J

Re: rack topology data update

Posted by Jameson Li <ho...@gmail.com>.
Hi Harsh J, Viji R, Saurabh bhutyani,

Thanks for all of your replies.

But the namenode really does not refresh the rack info.

Is it an issue with my Hadoop version? It is based on
hadoop-0.20-append, with 4 patches on top that I don't think have
anything to do with rack awareness.

If I have to restart the namenode to refresh the rack info every time I
add new nodes, I will go crazy...

Focused on MySQL, MSSQL, Oracle, Hadoop


2012/9/13 Saurabh bhutyani <s4...@gmail.com>

> I believe running the following command on namenode should refresh it.
>
> 'hadoop dfsadmin -refreshNodes'
>
> Thanks & Regards,
> Saurabh Bhutyani
>
> Call  : 9820083104
> Gtalk: s4saurabh@gmail.com
>
>
>
>
> On Thu, Sep 13, 2012 at 11:25 AM, Viji R <vi...@cloudera.com> wrote:
>
>> Hi Jameson,
>>
>> If the NameNode has cached the wrong value earlier, it will not
>> refresh that until you restart it.
>>
>> On Thu, Sep 13, 2012 at 11:21 AM, Jameson Li <ho...@gmail.com> wrote:
>> > Hi harsh,
>> >
>> > I have followed the steps you suggested.
>> >
>> > 1. Stop the new datanode. (I had already modified the topology file on
>> > the namenode.)
>> > 2. Run 'hadoop dfsadmin -refreshNodes' on the namenode.
>> > 3. Start the new datanode.
>> >
>> > But it really does not update the topology mapping.
>> > The namenode log just shows this at startup:
>> > "
>> > 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology:
>> > Removing a node: /default-rack/10.0.10.100:50010
>> > 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology:
>> > Adding a new node: /default-rack/10.0.10.100:50010
>> > "
>> >
>> >
>> > Focused on MySQL, MSSQL, Oracle, Hadoop
>> >
>> >
>> >
>> > 2012/9/13 Harsh J <ha...@cloudera.com>
>> >>
>> >> the DN only after (2) so it picks up the right mapping an
>> >
>> >
>>
>
>

Re: rack topology data update

Posted by Jameson Li <ho...@gmail.com>.
Hi  Harsh J,  Viji R,  Saurabh bhutyani,

Thanks for all of yours replying.

But really the namenode not refresh the rack info.

Is my hadoop version issue? My hadoop version is base on
hadoop-0.20-append, and have 4 patches on it that I think they are really
no matter with the rack awareness.

If anytime I add new nodes, and I should restart namenode to refresh the
rack info, I will crazy...

专注于Mysql,MSSQL,Oracle,Hadoop


2012/9/13 Saurabh bhutyani <s4...@gmail.com>

> I believe running the following command on namenode should refresh it.
>
> 'hadoop dfsadmin -refreshNodes'
>
> Thanks & Regards,
> Saurabh Bhutyani
>
> Call  : 9820083104
> Gtalk: s4saurabh@gmail.com
>
>
>
>
> On Thu, Sep 13, 2012 at 11:25 AM, Viji R <vi...@cloudera.com> wrote:
>
>> Hi Jameson,
>>
>> If the NameNode has cached the wrong value earlier, it will not
>> refresh that until you restart it.
>>
>> On Thu, Sep 13, 2012 at 11:21 AM, Jameson Li <ho...@gmail.com> wrote:
>> > Hi harsh,
>> >
>> > I have followed your suggestion operation.
>> >
>> > 1, stop the new datanode.(I have modified the topology file in the
>> > namenode before.)
>> > 2, run 'hadoop dfsadmin -refreshNodes' on the namenode
>> > 3, start the new datanode.
>> >
>> > But it really not update the new topology mapping.
>> > It just show the start info in the namenode that:
>> > "
>> > 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.0.10.100:50010
>> > 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.0.10.100:50010
>> > "
>> >
>> >
>> > Focused on MySQL, MSSQL, Oracle, Hadoop
>> >
>> >
>> >
>> > 2012/9/13 Harsh J <ha...@cloudera.com>
>> >>
>> >> the DN only after (2) so it picks up the right mapping an
>> >
>> >
>>
>
>

Re: rack topology data update

Posted by Saurabh bhutyani <s4...@gmail.com>.
I believe running the following command on the namenode should refresh it.

'hadoop dfsadmin -refreshNodes'

Thanks & Regards,
Saurabh Bhutyani

Call  : 9820083104
Gtalk: s4saurabh@gmail.com



On Thu, Sep 13, 2012 at 11:25 AM, Viji R <vi...@cloudera.com> wrote:

> Hi Jameson,
>
> If the NameNode has cached the wrong value earlier, it will not
> refresh that until you restart it.
>
> On Thu, Sep 13, 2012 at 11:21 AM, Jameson Li <ho...@gmail.com> wrote:
> > Hi harsh,
> >
> > I have followed your suggestion operation.
> >
> > 1, stop the new datanode.(I have modified the topology file in the
> > namenode before.)
> > 2, run 'hadoop dfsadmin -refreshNodes' on the namenode
> > 3, start the new datanode.
> >
> > But it really not update the new topology mapping.
> > It just show the start info in the namenode that:
> > "
> > 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.0.10.100:50010
> > 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.0.10.100:50010
> > "
> >
> >
> > Focused on MySQL, MSSQL, Oracle, Hadoop
> >
> >
> >
> > 2012/9/13 Harsh J <ha...@cloudera.com>
> >>
> >> the DN only after (2) so it picks up the right mapping an
> >
> >
>

Re: rack topology data update

Posted by Viji R <vi...@cloudera.com>.
Hi Jameson,

If the NameNode has cached the wrong value earlier, it will not
refresh that until you restart it.

On Thu, Sep 13, 2012 at 11:21 AM, Jameson Li <ho...@gmail.com> wrote:
> Hi harsh,
>
> I have followed your suggestion operation.
>
> 1, stop the new datanode.(I have modified the topology file in the namenode
> before.)
> 2, run 'hadoop dfsadmin -refreshNodes' on the namenode
> 3, start the new datanode.
>
> But it really not update the new topology mapping.
> It just show the start info in the namenode that:
> "
> 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.0.10.100:50010
> 2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.0.10.100:50010
> "
>
>
> Focused on MySQL, MSSQL, Oracle, Hadoop
>
>
>
> 2012/9/13 Harsh J <ha...@cloudera.com>
>>
>> the DN only after (2) so it picks up the right mapping an
>
>

Re: rack topology data update

Posted by Jameson Li <ho...@gmail.com>.
Hi Harsh,

I followed the steps you suggested:

1. Stop the new datanode. (I had already modified the topology file on the
namenode.)
2. Run 'hadoop dfsadmin -refreshNodes' on the namenode.
3. Start the new datanode.

But the topology mapping was still not updated.
When the datanode started, the namenode log only showed:
"
2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.0.10.100:50010
2012-09-13 13:44:14,706 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.0.10.100:50010
"
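
To see which datanodes the namenode has placed under /default-rack, the
namenode log can be scanned for the "Adding a new node" message quoted here.
This is only a minimal sketch: the function name default_rack_nodes is
hypothetical, and it assumes each log entry is on a single line, as in the
log file itself rather than as wrapped by a mail client.

```bash
#!/bin/bash
# Hypothetical helper: list the datanodes that the namenode log shows as
# added under /default-rack, one "host:port" per line, without duplicates.
default_rack_nodes() {
  local logfile="$1"
  # Keep only the "Adding a new node" entries for /default-rack,
  # then strip everything up to and including the rack prefix.
  grep -F 'NetworkTopology: Adding a new node: /default-rack/' "$logfile" \
    | sed 's|.*Adding a new node: /default-rack/||' \
    | sort -u
}
```

It would be called with the path of the namenode log as its only argument;
an empty result means no datanode was logged as joining the default rack.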


Focused on MySQL, MSSQL, Oracle, Hadoop


2012/9/13 Harsh J <ha...@cloudera.com>

> the DN only after (2) so it picks up the right mapping an

Re: rack topology data update

Posted by Harsh J <ha...@cloudera.com>.
Jameson,

The right process to add a new node with the right mapping is:

1. Update the topology file for the new DN.
2. Issue a dfsadmin -refreshNodes to get the new topology mapping updated in
the NN.
3. Start the DN only after (2), so that it picks up the right mapping and a
default mapping does not get cached.
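
The three steps above can be sketched as a shell sequence. This is only a
sketch under assumptions: the helper name rack_is_mapped is hypothetical, and
the script path and IP address are taken from elsewhere in this thread.
Whether -refreshNodes makes the NN pick up a changed rack mapping depends on
the version. dfsadmin documents it as re-reading the include/exclude host
files, and a follow-up in this thread reports that it did not help on
hadoop-0.20-append. The helper only checks that the topology script already
returns a non-default rack for the new host before the DN is started.

```bash
#!/bin/bash
# Hypothetical pre-check: succeeds only if the topology command maps the
# host to something other than /default-rack.
rack_is_mapped() {
  local topology_cmd="$1" host="$2" rack
  rack="$("$topology_cmd" "$host")" || return 1
  rack="${rack%% *}"   # drop the trailing space the script in this thread prints
  [ -n "$rack" ] && [ "$rack" != "/default-rack" ]
}

# Intended order on the namenode, following the three steps (not run here):
#   rack_is_mapped /opt/hadoop/conf/rack-awareness.sh 10.0.10.100 &&
#     hadoop dfsadmin -refreshNodes
# and only then, on the new datanode:
#   hadoop-daemon.sh start datanode
```

If the pre-check fails, the topology data does not yet cover the new host,
and starting the DN at that point would register it under /default-rack.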

On Thu, Sep 13, 2012 at 8:33 AM, Jameson Li <ho...@gmail.com> wrote:
> Our hadoop version is hadoop-0.20-append+4.
>
> We have configured the rack awareness in the namenode.
> But when I add new datanode, and update the topology data file, and restart
> the datanode, I just see the log in the namenode that:
> 2012-09-13 10:35:25,074 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/ipc:50010
> So should I restart the namenode?
> Is there some command like 'hadoop dfsadmin -refreshtopology'?
>
> My configuration:
>
> core-site.xml:
> <property>
> <name>topology.script.file.name</name>
> <value>conf/rack-awareness.sh</value>
> </property>
> <property>
> <name>topology.script.number.args</name>
> <value>1000</value>
> </property>
>
> conf/rack-awareness.sh:
> #!/bin/sh
>
> HADOOP_CONF=/opt/hadoop/conf
>
> while [ $# -gt 0 ] ; do
>   nodeArg=$1
>   exec< ${HADOOP_CONF}/topology.data
>   result=""
>   while read line ; do
>     ar=( $line )
>     if [ "${ar[0]}" = "$nodeArg" ] ; then
>       result="${ar[1]}"
>     fi
>   done
>   shift
>   if [ -z "$result" ] ; then
>     echo -n "/default-rack "
>   else
>     echo -n "$result "
>   fi
> done
>
> topology.data:
> ipa rackA
> ipb rackA
> ipc rackB
>
>
> And also I have search the mailing list "topology.script.file update":
> I found a mail that:
> Tom Hall 2011-10-27, 16:07
> I was hoping that if I updated the file it would give new answers as
> datanodes were restarted and reconnected but that does not seem to be
> the case.
> Surely I dont need to restart the namenode...
>
> But there is not replying.
> So somebody can help me?
>
>
> Focused on MySQL, MSSQL, Oracle, Hadoop
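
A side note on the quoted rack-awareness.sh: it starts with #!/bin/sh but
uses array syntax (ar=( $line )), which is a bash feature and fails where
/bin/sh is a different shell such as dash. Below is a minimal sketch of a
bash variant that does the same lookup. The function name lookup_rack and the
TOPOLOGY_FILE override are assumptions for illustration; the default path and
the "host rack" file layout follow the configuration quoted above. It does
not change how the namenode caches the result.

```bash
#!/bin/bash
# Sketch of a rack lookup script, a variation on the script quoted above.
# Assumption: the data file holds one "host rack" pair per line.
TOPOLOGY_FILE="${TOPOLOGY_FILE:-/opt/hadoop/conf/topology.data}"

# Print the rack for one host, or /default-rack if the host is not listed.
lookup_rack() {
  local node="$1" host rack result=""
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while read -r host rack _ || [ -n "$host" ]; do
    if [ "$host" = "$node" ]; then
      result="$rack"   # the last matching line wins, as in the original
    fi
  done < "$TOPOLOGY_FILE"
  printf '%s' "${result:-/default-rack}"
}

# Hadoop passes one or more hosts as arguments and reads the racks from
# stdout, separated by whitespace.
for node in "$@"; do
  lookup_rack "$node"
  printf ' '
done
```

Running the script by hand with a host from topology.data as its argument
is a quick way to confirm the mapping before involving the namenode.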



-- 
Harsh J
