Posted to mapreduce-user@hadoop.apache.org by Georgi Ivanov <iv...@vesseltracker.com> on 2014/10/22 13:47:24 UTC

HDFS multiple dfs_data_dir disbalance

Hi,
My cluster is configured with 2 data dirs.
/data/1
/data/2

Usually Hadoop balances the utilization of these dirs.
Now I have one node where /data/1 is 100% full and /data/2 is not.

Is there anything I can do about this, as it results in failed
mappers/reducers?

Georgi
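
(A minimal shell sketch for confirming the per-disk skew on the affected
node; it assumes each data dir is its own mount with the usual dfs/dn
layout underneath, so adjust the paths to your dfs.datanode.data.dir value.)

df -h /data/1 /data/2                     # raw filesystem usage per disk
du -sh /data/1/dfs/dn /data/2/dfs/dn      # space actually held by HDFS block data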



RE: HDFS multiple dfs_data_dir disbalance

Posted by Rakesh R <ra...@huawei.com>.
Yes, there is a VolumeChoosingPolicy configuration, "dfs.datanode.fsdataset.volume.choosing.policy", in HDFS,
and by default it is set to choose volumes in round-robin order. Are you using the default policy?


Did you see any warnings or errors about '/data/2' in the DataNode logs? The DataNode runs sanity checks on each
data directory and will skip a directory if any check throws an exception, so it is worth looking for that.
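
A quick way to check both of those from the shell (a sketch; run it on the
DataNode host, and the log path below is an assumption, so substitute your
actual DataNode log file):

# Print the volume choosing policy the configuration resolves to; an empty
# result (or the RoundRobinVolumeChoosingPolicy class) means the default
# round-robin policy is in effect.
hdfs getconf -confKey dfs.datanode.fsdataset.volume.choosing.policy

# Lines mentioning the second volume that also carry a warning or failure.
grep -i '/data/2' /var/log/hadoop-hdfs/*datanode*.log | grep -iE 'warn|error|invalid|fail'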

Regards,
Rakesh

-----Original Message-----
From: Georgi Ivanov [mailto:ivanov@vesseltracker.com] 
Sent: 22 October 2014 17:17
To: user@hadoop.apache.org
Subject: HDFS multiple dfs_data_dir disbalance

Hi,
My cluster is configured with 2 data dirs.
/data/1
/data/2

Usually Hadoop balances the utilization of these dirs.
Now I have one node where /data/1 is 100% full and /data/2 is not.

Is there anything I can do about this, as it results in failed mappers/reducers?

Georgi



Re: HDFS multiple dfs_data_dir disbalance

Posted by Georgi Ivanov <iv...@vesseltracker.com>.
Thanks for the reply.

Unfortunately there is no extra data in this dir.

This is from the DN log:
2014-10-22 15:29:00,205 INFO 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
Added volume - /data/1/dfs/dn/current
2014-10-22 15:29:00,205 INFO 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
Added volume - /data/2/dfs/dn/current

Scheduling blk_8969115446695150692_1784945 file 
/data/2/dfs/dn/current/BP-1312742174-78.46.149.194-1359718879114/current/finalized/subdir44/subdir31/blk_8969115446695150692 
for deletion

So I can see /data/2 is used.

These dirs are actually 2 different disks. I now remember that one of 
those died recently and was replaced.

I don't see errors with fsck.

Name: xx.xx.xx.xx:50010 (dn2.domain.com)
Hostname: dn2
Rack: /default
Decommission Status : Normal
Configured Capacity: 5651963387904 (5.14 TB)
DFS Used: 3548748439552 (3.23 TB)
Non DFS Used: 267148099584 (248.80 GB)
DFS Remaining: 1836066848768 (1.67 TB)
DFS Used%: 62.79%
DFS Remaining%: 32.49%
Last contact: Wed Oct 22 15:42:46 CEST 2014

As you can see, here is another proof that /data/2 is used, as we still
have 1.67 TB free. If it were not used we would have ~0% free.

So the dir is used, but it is not balanced.
I think this is because of the disk crash.
But isn't Hadoop supposed to fix this?
The disk was replaced a few weeks ago...

Georgi
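
(An aside on why the replaced disk stays nearly empty: the default
round-robin policy only spreads new writes across volumes; it never moves
blocks that already exist, so a new disk catches up only as old data turns
over. Two commonly mentioned ways to deal with this, sketched here under
assumptions rather than as official procedure: switch
dfs.datanode.fsdataset.volume.choosing.policy to
org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy,
if your 2.x release includes it, so new blocks favor the emptier volume; or,
with the DataNode stopped, move finalized block subdirectories from the full
volume to the empty one while keeping their relative paths. A rough shell
sketch of the manual move follows; the block-pool ID is taken from the log
above, the hdfs user and subdir names are assumptions, and it is safest to
test with a single subdir first.)

# Stop the DataNode on this host first (command depends on your distro /
# management tooling), and run the rest as root or the block-dir owner.
BP=BP-1312742174-78.46.149.194-1359718879114
SRC=/data/1/dfs/dn/current/$BP/current/finalized
DST=/data/2/dfs/dn/current/$BP/current/finalized
mkdir -p "$DST"

# Move whole subdirectories, keeping the same relative layout. Only move ones
# that do not already exist on the target, otherwise mv would nest them.
# "subdir12" is an example name; pick real ones from $SRC.
[ -e "$DST/subdir12" ] || mv "$SRC/subdir12" "$DST/"

# The DataNode user is assumed to be hdfs; fix ownership of what was moved.
chown -R hdfs:hdfs "$DST"

# Restart the DataNode, then confirm nothing went missing:
hdfs fsck / | tail -n 20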

On 22.10.2014 15:05, Brahma Reddy Battula wrote:
> Does /data/1 have non-Hadoop data? Please check for that.
> Also check the admin report (hdfs dfsadmin -report) and the fsck report (hdfs fsck /).
>
> I am thinking one of the following might be the cause:
> a) /data/2 does not have write permission and dfs.datanode.failed.volumes.tolerated is configured as 1, so the DataNode keeps running without it
> b) /data/2 was only added to dfs.datanode.data.dir some time after the node had been in service
>
>
>
> Thanks & Regards
> Brahma Reddy Battula
> ________________________________________
> From: Georgi Ivanov [ivanov@vesseltracker.com]
> Sent: Wednesday, October 22, 2014 5:17 PM
> To: user@hadoop.apache.org
> Subject: HDFS multiple dfs_data_dir disbalance
>
> Hi,
> My cluster is configured with 2 data dirs.
> /data/1
> /data/2
>
> Usually Hadoop balances the utilization of these dirs.
> Now I have one node where /data/1 is 100% full and /data/2 is not.
>
> Is there anything I can do about this, as it results in failed
> mappers/reducers?
>
> Georgi
>
>
>
>



RE: HDFS multiple dfs_data_dir disbalance

Posted by Brahma Reddy Battula <br...@huawei.com>.
Does /data/1 have non-Hadoop data? Please check for that.
Also check the admin report (hdfs dfsadmin -report) and the fsck report (hdfs fsck /).

I am thinking one of the following might be the cause (a quick check for point (a) is sketched below):
a) /data/2 does not have write permission and dfs.datanode.failed.volumes.tolerated is configured as 1, so the DataNode keeps running without it
b) /data/2 was only added to dfs.datanode.data.dir some time after the node had been in service
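
A quick way to check point (a) and the non-Hadoop-data question from the
shell (a sketch; the hdfs service user and the dfs/dn paths are assumptions,
adjust them to your setup):

# Is the second volume writable by the DataNode user, and how many failed
# volumes is the DataNode allowed to tolerate before it aborts?
ls -ld /data/2/dfs/dn
sudo -u hdfs touch /data/2/dfs/dn/.write_test && sudo -u hdfs rm /data/2/dfs/dn/.write_test
hdfs getconf -confKey dfs.datanode.failed.volumes.tolerated

# Anything large under /data/1 outside the HDFS block directory?
du -sh /data/1/* | sort -h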



Thanks & Regards
Brahma Reddy Battula
________________________________________
From: Georgi Ivanov [ivanov@vesseltracker.com]
Sent: Wednesday, October 22, 2014 5:17 PM
To: user@hadoop.apache.org
Subject: HDFS multiple dfs_data_dir disbalance

Hi,
My cluster is configured with 2 data dirs.
/data/1
/data/2

Usually Hadoop balances the utilization of these dirs.
Now I have one node where /data/1 is 100% full and /data/2 is not.

Is there anything I can do about this, as it results in failed
mappers/reducers?

Georgi


