Posted to hdfs-user@hadoop.apache.org by "Hiller, Dean (Contractor)" <de...@broadridge.com> on 2011/01/04 14:13:13 UTC
RE: decommission nodes not working (localhost vs. IPs in website too) ++
Well, the decommission is still running 12 hours later. I only have
1.8 GB in HDFS, and only 0.06 + 0.25 GB needs to be moved. Should this
really be taking more than 12 hours?
Here is the report:
Name: 206.88.41.146:50010
Decommission Status : Decommission in progress
Configured Capacity: 10568916992 (9.84 GB)
DFS Used: 67788800 (64.65 MB)
Non DFS Used: 694824960 (662.64 MB)
DFS Remaining: 9806303232 (9.13 GB)
DFS Used%: 0.64%
DFS Remaining%: 92.78%
Last contact: Mon Jan 03 23:43:15 MST 2011
Name: 206.88.41.170:50010
Decommission Status : Decommission in progress
Configured Capacity: 10568916992 (9.84 GB)
DFS Used: 270929920 (258.38 MB)
Non DFS Used: 694824960 (662.64 MB)
DFS Remaining: 9603162112 (8.94 GB)
DFS Used%: 2.56%
DFS Remaining%: 90.86%
Last contact: Mon Jan 03 23:43:14 MST 2011
Name: 206.88.41.159:50010
Decommission Status : Normal
Configured Capacity: 10568916992 (9.84 GB)
DFS Used: 1804140544 (1.68 GB)
Non DFS Used: 694824960 (662.64 MB)
DFS Remaining: 8069951488 (7.52 GB)
DFS Used%: 17.07%
DFS Remaining%: 76.36%
Last contact: Mon Jan 03 23:43:16 MST 2011
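An aside on the report above (an editor's hedged observation, not from the thread): decommissioning only finishes once every block has been re-replicated onto nodes that are not being decommissioned. With 3 datanodes, 2 of them decommissioning, and the default replication factor of 3 (an assumption; the thread does not state dfs.replication), the target can never be reached, so the status can stay "in progress" indefinitely. A minimal sketch of that arithmetic:

```java
// Hedged sketch: decommission can only complete if the nodes that will
// remain can hold all replicas of every block.
public class DecommissionCheck {
    // remaining non-decommissioning nodes must be >= the replication factor
    static boolean canComplete(int liveNodes, int decommissioning, int replicationFactor) {
        return (liveNodes - decommissioning) >= replicationFactor;
    }

    public static void main(String[] args) {
        // The report shows 3 datanodes with 2 decommissioning; with an
        // assumed replication factor of 3, it can never finish.
        System.out.println(canComplete(3, 2, 3)); // prints "false"
        // Lowering replication to 1, or adding nodes first, would allow it.
        System.out.println(canComplete(3, 2, 1)); // prints "true"
    }
}
```

If this is the situation, adding replacement nodes before (or instead of) lowering dfs.replication is the usual way out.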
Looking in the logs, I only see:
2011-01-03 20:03:35,340 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 11
blocks got processed in 10 msecs
2011-01-03 21:03:37,370 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 11
blocks got processed in 0 msecs
2011-01-03 22:03:36,370 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 11
blocks got processed in 0 msecs
2011-01-03 23:03:35,380 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 11
blocks got processed in 0 msecs
Any ideas here?
Thanks,
Dean
From: Hiller, Dean (Contractor)
Sent: Monday, January 03, 2011 6:23 PM
To: 'hdfs-user@hadoop.apache.org'
Subject: RE: decommission nodes not working (localhost vs. IPs in website
too) - more info
So I edited /etc/hosts with the ugly
127.0.0.1 <FQDN> <hostname>
and the nodes now decommission. This is horrible from a node-addition
perspective. We want to quickly add a box, upload the same image, turn
it on, add it to the slaves file, and run refreshNodes (rebalance if
wanted) and be done with it. We don't want to have to edit /etc/hosts
every time.
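For reference, a cleaner variant of the workaround above (the FQDN and hostname here are hypothetical, and the IP is taken from the report only for illustration) keeps the loopback line standard and maps the node's routable IP to its FQDN instead:

```
# standard loopback line, left alone
127.0.0.1       localhost localhost.localdomain
# map the node's real IP to its FQDN so hostname resolution
# returns the routable address rather than loopback
206.88.41.146   node1.example.com node1
```

This still requires touching /etc/hosts per node, so it does not remove the provisioning pain described above, but it avoids binding the FQDN to 127.0.0.1.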
Is there any way around this? (It would be great if the DataNode were
given its hostname by the master's start-dfs.sh command, i.e. the ssh
exec could pass the name from the slaves file, avoiding the need for
/etc/hosts entries.)
Any ideas? The docs said "127.0.0.1 localhost" should be fine.
Anyone else decommissioning nodes and not having this issue?
Dean
From: Hiller, Dean (Contractor)
Sent: Monday, January 03, 2011 6:03 PM
To: 'hdfs-user@hadoop.apache.org'
Subject: decommission nodes not working (localhost vs. IPs in website too)
Luckily I am in dev, so it's not a biggie, but the datanode seems to be
reading from /etc/hosts (i.e. Java calls to InetAddress.getLocalHost
return 127.0.0.1 instead of the IP) when displaying the names of the
live nodes. When displaying the names of the dead nodes, however, it
displays the hostnames from my slaves and exclude files.
I wonder why the hadoop script doesn't pass the FQDN from the slaves
file to the slave node on startup, so there would be no /etc/hosts
lookup, and the node could then bind to the correct FQDN as well.
Anyway, my dead node shows up in my live nodes list (as localhost, which
it's not, though with the correct IP) and never reaches the
decommissioned state. Is there any way to solve this?
I read that my /etc/hosts file is supposed to contain "127.0.0.1
localhost localhost.localdomain", but to get the hostname to display
correctly I need something more like "127.0.0.1 <FQDN> <hostname>",
since then I know it would display properly... and I may even have to
change the 127.0.0.1 to <ip>, because InetAddress.getLocalHost returns
whatever is in /etc/hosts on every Linux system I have used (Ubuntu and
CentOS at least).
Any way to fix this?
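The getLocalHost behavior described here is easy to probe directly. A minimal sketch (the printed values depend entirely on the machine's resolver configuration, so no particular output is guaranteed):

```java
import java.net.InetAddress;

public class LocalHostProbe {
    public static void main(String[] args) throws Exception {
        // getLocalHost() asks the OS for the machine's hostname and then
        // resolves it through the normal resolver chain; on Linux that
        // usually consults /etc/hosts first, which is why a
        // "127.0.0.1 <FQDN>" line makes the JVM (and thus Hadoop) see the
        // node as loopback.
        InetAddress addr = InetAddress.getLocalHost();
        System.out.println("hostname: " + addr.getHostName());
        System.out.println("address:  " + addr.getHostAddress());
    }
}
```

Running this on a datanode before and after editing /etc/hosts shows exactly what name and address the namenode will be given.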
thanks,
Dean