Posted to user@hadoop.apache.org by Steve Brenneis <sw...@gmail.com> on 2016/10/04 20:22:21 UTC

HDFS Issues.

I have an HDFS cluster of three nodes, all running on Amazon EC2 instances. I am using HDFS as the backing store for HBase. Periodically, when I start the cluster, the name node stays in safe mode, saying that the number of live datanodes has dropped to 0.

The number of live datanodes 2 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.

The datanode logs appear to be normal, with no errors indicated. The dfsadmin report says the datanodes are both normal and that the name node is in contact with them.

Safe mode is ON
Configured Capacity: 16637566976 (15.49 GB)
Present Capacity: 7941234688 (7.40 GB)
DFS Remaining: 7940620288 (7.40 GB)
DFS Used: 614400 (600 KB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (2):

Name: 172.31.52.176:50010 (dev2)
Hostname: dev2
Decommission Status : Normal
Configured Capacity: 8318783488 (7.75 GB)
DFS Used: 307200 (300 KB)
Non DFS Used: 3257020416 (3.03 GB)
DFS Remaining: 5061455872 (4.71 GB)
DFS Used%: 0.00%
DFS Remaining%: 60.84%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 04 15:47:00 EDT 2016


Name: 172.31.63.188:50010 (dev1)
Hostname: dev1
Decommission Status : Normal
Configured Capacity: 8318783488 (7.75 GB)
DFS Used: 307200 (300 KB)
Non DFS Used: 5439311872 (5.07 GB)
DFS Remaining: 2879164416 (2.68 GB)
DFS Used%: 0.00%
DFS Remaining%: 34.61%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 04 15:47:00 EDT 2016

If I force the name node out of safe mode, the fsck command says that the file system is corrupt. When this happens, the only thing I've been able to do to get it back is to format the HDFS file system. I have not changed the configuration of the cluster. This seems to occur at random. The system is in development, but this will be unacceptable in production.

I’m using version 2.7.3. Thank you in advance for any help.
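
A note on reading the status line quoted in the post: it appears to give the current live datanode count (2) followed by the configured minimum (0), rather than saying the count has dropped to 0. The following is a minimal Python sketch of that reading. It assumes the exact wording shown in this post, which may differ in other Hadoop versions; the pattern and function names are hypothetical and not part of any Hadoop tool.

```python
import re

# Pattern for the status line as it appears in the post above. The wording is
# taken from that single message; it is an assumption that other NameNode
# builds phrase it the same way.
SAFE_MODE_LINE = re.compile(
    r"The number of live datanodes (?P<live>\d+) has reached "
    r"the minimum number (?P<minimum>\d+)"
)


def parse_live_datanode_status(line):
    """Return (live, minimum) parsed from the status line, or None if absent."""
    match = SAFE_MODE_LINE.search(line)
    if match is None:
        return None
    return int(match.group("live")), int(match.group("minimum"))


message = (
    "The number of live datanodes 2 has reached the minimum number 0. "
    "Safe mode will be turned off automatically once the thresholds "
    "have been reached."
)

live, minimum = parse_live_datanode_status(message)
# The datanode condition holds when the live count is at least the minimum.
datanode_condition_met = live >= minimum
print(live, minimum, datanode_condition_met)  # prints: 2 0 True
```

Read this way, the datanode condition is already satisfied, so whatever keeps the name node in safe mode would have to be some other threshold.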

Re: HDFS Issues.

Posted by Ravi Prakash <ra...@gmail.com>.

The Namenode comes out of safemode only once a few conditions are met:
# Number of live datanodes
# Number of blocks that have been reported

How many blocks have the datanodes reported?
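
The two conditions above can be sketched as a small Python model. This is a simplified illustration, not the NameNode's actual code. The defaults are the values of `dfs.namenode.safemode.threshold-pct` and `dfs.namenode.safemode.min.datanodes` as they are commonly documented for 2.7.x, and should be checked against your own hdfs-site.xml. The block counts in the usage lines are hypothetical, since the thread does not report them. The sketch also ignores the safemode extension period and any resource checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafeModeThresholds:
    # Assumed defaults; verify against hdfs-default.xml for your release.
    threshold_pct: float = 0.999  # dfs.namenode.safemode.threshold-pct
    min_datanodes: int = 0        # dfs.namenode.safemode.min.datanodes


def unmet_conditions(reported_blocks, total_blocks, live_datanodes,
                     thresholds=SafeModeThresholds()):
    """Return descriptions of the conditions that still hold safe mode on."""
    unmet = []
    # Fraction of known blocks that must be reported by datanodes.
    needed_blocks = int(total_blocks * thresholds.threshold_pct)
    if thresholds.threshold_pct > 0 and reported_blocks < needed_blocks:
        unmet.append(
            f"blocks: {reported_blocks} reported, {needed_blocks} needed"
        )
    # Minimum number of live datanodes; a value of 0 disables this check.
    if thresholds.min_datanodes > 0 and live_datanodes < thresholds.min_datanodes:
        unmet.append(
            f"datanodes: {live_datanodes} live, "
            f"{thresholds.min_datanodes} needed"
        )
    return unmet


# Hypothetical numbers: 2 live datanodes (as in the report) but no blocks
# reported out of 200 known to the name node.
print(unmet_conditions(reported_blocks=0, total_blocks=200, live_datanodes=2))

# Same cluster once every block has been reported.
print(unmet_conditions(reported_blocks=200, total_blocks=200, live_datanodes=2))
```

With a minimum of 0 datanodes, only the block condition can keep this model in safe mode, which is why the number of reported blocks is the figure worth checking.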
