Posted to hdfs-dev@hadoop.apache.org by "Brahma Reddy Battula (JIRA)" <ji...@apache.org> on 2016/03/29 06:26:25 UTC

[jira] [Created] (HDFS-10226) Track and use BlockScheduled size for DatanodeDescriptor instead of count.

Brahma Reddy Battula created HDFS-10226:
-------------------------------------------

             Summary: Track and use BlockScheduled size for DatanodeDescriptor instead of count.
                 Key: HDFS-10226
                 URL: https://issues.apache.org/jira/browse/HDFS-10226
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Brahma Reddy Battula
            Assignee: Brahma Reddy Battula


Tracking only the count of scheduled blocks results in an inaccurate estimate of remaining space when files with different block sizes are being written.

 *For Example:*  
1. Datanode capacity is 10GB; available space is 2GB.
2. For NNBench testing, a small block size (such as 1MB) might be used. If 20 such blocks are currently being written to the DN, the scheduled counter will be 20.
3. This counter causes no problem for the NNBench blocks themselves (20 * 1MB = 20MB fits easily in the 2GB available).

But for a normal file with a 128MB block size, the remaining space will be computed as 0, because the estimate multiplies the counter by the current file's block size, not by the sizes that were originally scheduled: 20 * 128MB = 2.5GB, which is greater than the 2GB available, so remaining is reported as 0 for the normal block.

As a result, the client hits an exception like:
"Could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and no node(s) are excluded in this operation"
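The arithmetic above can be sketched as follows. This is a hypothetical illustration, not the actual DatanodeDescriptor code; the method names remainingWithCount and remainingWithSize are invented for the example. It contrasts the count-based estimate (counter multiplied by the current file's block size) with a size-based estimate that tracks the actual scheduled bytes:

```java
// Hypothetical sketch of the two estimation strategies described in this issue.
public class ScheduledSpaceDemo {
    static final long MB = 1024L * 1024;
    static final long GB = 1024 * MB;

    // Count-based estimate: assumes every scheduled block is as large as the
    // block size of the file currently being written (current HDFS behavior
    // described in this report).
    static long remainingWithCount(long available, int scheduledBlocks, long currentBlockSize) {
        return Math.max(0, available - scheduledBlocks * currentBlockSize);
    }

    // Size-based estimate: tracks the actual bytes scheduled per block
    // (the behavior this issue proposes).
    static long remainingWithSize(long available, long scheduledBytes) {
        return Math.max(0, available - scheduledBytes);
    }

    public static void main(String[] args) {
        long available = 2 * GB;     // DN has 2GB available
        int scheduled = 20;          // 20 NNBench blocks in flight
        long nnBenchBlock = 1 * MB;  // each scheduled block is only 1MB
        long normalBlock = 128 * MB; // block size of the new, normal file

        // Count-based: 20 * 128MB = 2.5GB > 2GB, so remaining appears to be 0
        // and the DN is skipped during block placement.
        System.out.println(remainingWithCount(available, scheduled, normalBlock)); // 0

        // Size-based: only 20 * 1MB = 20MB is actually reserved, so ~2GB remains
        // and the DN would still be chosen.
        System.out.println(remainingWithSize(available, scheduled * nnBenchBlock));
    }
}
```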



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)