Posted to hdfs-user@hadoop.apache.org by "Hiller, Dean (Contractor)" <de...@broadridge.com> on 2011/01/04 14:13:13 UTC

RE: decommission nodes not working (localhost vs. IPs in website too) ++

Well, the decommission is still running 12 hours later.  I only have
1.8 GB in HDFS, and only about 0.06 GB + 0.25 GB needs to be moved off
the two decommissioning nodes.  Should this really be taking more than
12 hours?

 

Here is the report

 

Name: 206.88.41.146:50010

Decommission Status : Decommission in progress

Configured Capacity: 10568916992 (9.84 GB)

DFS Used: 67788800 (64.65 MB)

Non DFS Used: 694824960 (662.64 MB)

DFS Remaining: 9806303232(9.13 GB)

DFS Used%: 0.64%

DFS Remaining%: 92.78%

Last contact: Mon Jan 03 23:43:15 MST 2011

 

Name: 206.88.41.170:50010

Decommission Status : Decommission in progress

Configured Capacity: 10568916992 (9.84 GB)

DFS Used: 270929920 (258.38 MB)

Non DFS Used: 694824960 (662.64 MB)

DFS Remaining: 9603162112(8.94 GB)

DFS Used%: 2.56%

DFS Remaining%: 90.86%

Last contact: Mon Jan 03 23:43:14 MST 2011

 

Name: 206.88.41.159:50010

Decommission Status : Normal

Configured Capacity: 10568916992 (9.84 GB)

DFS Used: 1804140544 (1.68 GB)

Non DFS Used: 694824960 (662.64 MB)

DFS Remaining: 8069951488(7.52 GB)

DFS Used%: 17.07%

DFS Remaining%: 76.36%

Last contact: Mon Jan 03 23:43:16 MST 2011

 

Looking in the logs, I only see 

2011-01-03 20:03:35,340 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 11
blocks got processed in 10 msecs

2011-01-03 21:03:37,370 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 11
blocks got processed in 0 msecs

2011-01-03 22:03:36,370 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 11
blocks got processed in 0 msecs

2011-01-03 23:03:35,380 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 11
blocks got processed in 0 msecs

 

Any ideas here?
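
For reference, these are the sorts of commands I've been watching progress
with (this assumes a running 0.20-era cluster; nothing here is specific to
my boxes):

```shell
# Illustrative only: check decommission state and block placement on a
# running Hadoop cluster.
hadoop dfsadmin -report                    # per-datanode status, including
                                           # "Decommission in progress"
hadoop fsck / -files -blocks -locations   # which datanodes hold each block,
                                           # and whether any are under-replicated
```

If fsck reports all blocks at full replication on the remaining nodes, the
decommission ought to finish; if it never does, something else is stuck.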

Thanks,

Dean

 

 

From: Hiller, Dean (Contractor) 
Sent: Monday, January 03, 2011 6:23 PM
To: 'hdfs-user@hadoop.apache.org'
Subject: RE: decommission nodes not working (localhost vs. IPs in website
too) - more info

 

So I edited /etc/hosts with the ugly entry

 

127.0.0.1 <FQDN> <hostname>

 

And nodes now decommission.  This is horrible from a node-addition
perspective.  We want to quickly add a box, upload the same image, turn
it on, add it to the slaves file, run refreshNodes (and rebalance if
wanted), and be done with it.  We don't want to have to edit /etc/hosts
every time.
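
The flow we'd like is roughly this (a sketch; the hostname is a
placeholder, and the commands are the 0.20-era CLI as I understand it):

```shell
# Desired add-node flow, sketched; hostname and path are placeholders.
echo "newnode.example.com" >> "$HADOOP_HOME/conf/slaves"  # register the new box
hadoop dfsadmin -refreshNodes    # namenode re-reads the include/exclude lists
start-balancer.sh                # optional: spread existing blocks onto it
```

No per-node /etc/hosts surgery anywhere in that list, which is the point.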

 

Is there any way around this?  (It would be great if the DataNode were
given its hostname by the master's start-dfs.sh command, i.e. the ssh
exec would simply pass the name from the slaves file, avoiding any need
for /etc/hosts.)

 

Any ideas?  The docs said "127.0.0.1 localhost" should be fine.

 

Anyone else decommissioning nodes and not having this issue?

 

Dean

 

From: Hiller, Dean (Contractor) 
Sent: Monday, January 03, 2011 6:03 PM
To: 'hdfs-user@hadoop.apache.org'
Subject: decommission nodes not working (localhost vs. IPs in website too)

 

Luckily I am in dev, so this is not a big deal, but the datanode seems
to be reading from /etc/hosts (i.e. Java calls to
InetAddress.getLocalHost return 127.0.0.1 instead of the IP) when
displaying the names of the live nodes.  When displaying the names of
the dead nodes, however, it shows the hostname from my slaves and
exclude files.

 

I wonder why the hadoop script doesn't pass the FQDN from the slaves
file to the slave node on startup, so there is no /etc/hosts lookup at
all; the datanode could then also bind to the correct FQDN if it wanted
to.

 

Anyway, my dead node shows up in my live nodes list (as localhost, which
it is not, though with the correct IP) and is not going to a
decommissioned state.  Is there any way to solve this?

 

I read that /etc/hosts is supposed to contain "127.0.0.1 localhost
localhost.localdomain", but to get the hostname to display correctly I
need something more like "127.0.0.1 <FQDN> <hostname>" instead, as then
I know it would display properly there... and I may even have to change
the 127.0.0.1 to <ip>, since InetAddress.getLocalHost returns whatever
is in /etc/hosts on every Linux system I have used (Ubuntu and CentOS at
least).
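
You can see what the resolver hands back without going through Java at
all; getent consults /etc/hosts first (per /etc/nsswitch.conf), which is
effectively what InetAddress.getLocalHost ends up seeing here:

```shell
# Check what name resolution returns on this box; getent reads /etc/hosts
# before DNS on a default nsswitch.conf.
getent hosts localhost                 # loopback entry from /etc/hosts
getent hosts "$(hostname)" || echo "no /etc/hosts entry for $(hostname)"
# If the second line prints 127.0.0.1, HDFS will report this node as localhost.
```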

 

Any way to fix this?

 

thanks,

Dean

