Posted to hdfs-dev@hadoop.apache.org by "Colin Patrick McCabe (Created) (JIRA)" <ji...@apache.org> on 2012/04/18 22:14:40 UTC

[jira] [Created] (HDFS-3297) Update free space in the DataBlockScanner rather than using du

Update free space in the DataBlockScanner rather than using du
--------------------------------------------------------------

                 Key: HDFS-3297
                 URL: https://issues.apache.org/jira/browse/HDFS-3297
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: data-node
    Affects Versions: 0.23.0
            Reporter: Colin Patrick McCabe
            Assignee: Colin Patrick McCabe
            Priority: Minor


As the DataNode adds new blocks to a BlockPool, it keeps a running count of how much space that block pool consumes.  The DataNode sends this count to the NameNode, which uses it for usage statistics and similar reporting.

Periodically, we check what is actually on disk to make sure that the counts we are keeping are accurate.  The DataNode currently launches a "du -s" process through the shell every few minutes and takes its result as the new used-space figure.
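
For illustration, the periodic check described above could look roughly like the following. This is only a sketch, not the DataNode's actual code: the class name DuUsageProbe and its methods are hypothetical, and it passes "-sk" rather than plain "-s" so that the unit of the output (1024-byte blocks) is known.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; illustrates shelling out to du, not Hadoop's real implementation. */
class DuUsageProbe {

    /** Parses a "du -sk" output line such as "1234\t/data/dir" and returns the size in bytes. */
    static long parseKilobytes(String duLine) {
        // The first whitespace-separated field is the size in 1024-byte units.
        String[] fields = duLine.trim().split("\\s+", 2);
        return Long.parseLong(fields[0]) * 1024L;
    }

    /** Runs "du -sk <dir>" and returns the reported usage in bytes. */
    static long usedBytes(String dir) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("du", "-sk", dir)
                // Let stderr pass through so an unread pipe cannot block the child.
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        String line;
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            line = r.readLine();
        }
        int exit = p.waitFor();
        if (exit != 0 || line == null) {
            throw new IOException("du failed for " + dir + ", exit code " + exit);
        }
        return parseKilobytes(line);
    }
}
```

Each call forks a process that stats every file under the directory, which is the cost this issue is about.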

We should compute the used-space figure in the DataBlockScanner, rather than in a separate du process.  The main reason is to avoid causing a lot of random I/O operations on the disk.  Since du has to visit every file in the BlockPool, it is essentially repeating work that the block scanner already does, with no added benefit.
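
A minimal sketch of the idea, assuming the scanner already visits every file in the block pool directory: sum the file lengths during that same pass and publish the total when the pass completes. The names ScanUsageTracker, verifyBlock, scan and getUsedBytes are hypothetical and are not the real DataBlockScanner API. Note that this sums logical file lengths, whereas du reports allocated disk blocks, so the two numbers can differ.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Stream;

/** Hypothetical sketch: piggyback used-space accounting on a scan pass. */
class ScanUsageTracker {
    // Total from the last completed pass; readable from a reporting thread.
    private final AtomicLong usedBytes = new AtomicLong();

    /** Stand-in for the scanner's per-block verification (e.g. checksum check). */
    private void verifyBlock(Path blockFile) {
        // Intentionally empty in this sketch.
    }

    /** Walks the directory once, verifying each file and summing its length. */
    long scan(Path blockPoolDir) throws IOException {
        long total = 0;
        try (Stream<Path> paths = Files.walk(blockPoolDir)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p)) {
                    verifyBlock(p);
                    total += Files.size(p);
                }
            }
        }
        // Publish only after the whole pass finishes, so readers never see a partial sum.
        usedBytes.set(total);
        return total;
    }

    /** Returns the total recorded by the most recent completed pass. */
    long getUsedBytes() {
        return usedBytes.get();
    }
}
```

With something like this, the value reported to the NameNode could be refreshed from getUsedBytes() after each pass, with no separate du process.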
