Posted to common-user@hadoop.apache.org by Adarsh Sharma <ad...@orkash.com> on 2010/08/17 14:23:17 UTC

Non-DFS used Parameter

Hi all,
I am not able to understand clearly what *Non-DFS Used* means in the
Hadoop NameNode Web UI.
I think it is the extra space occupied by temporary map-reduce local
files.
Can anyone please tell me how to change that parameter and what it is
comprised of?

Thanks in advance.




Re: Non-DFS used Parameter

Posted by Michael Thomas <th...@hep.caltech.edu>.
On 08/17/2010 05:23 AM, Adarsh Sharma wrote:
> Hi all,
> I am not able to understand clearly what *Non-DFS Used* means in the
> Hadoop NameNode Web UI.
> I think it is the extra space occupied by temporary map-reduce local
> files.
> Can anyone please tell me how to change that parameter and what it is
> comprised of?

"Non-DFS used" is any space on the datanode's ${dfs.data.dir} 
_partition_ that is not part of the ${dfs.data.dir}/current _directory_. 
If your log files or OS share the same partition, then the space they
occupy will be reported as "Non-DFS used".  MR (MapReduce) may also make
use of this partition, but I can't comment on what it might place
there.  In the past we have seen non-DFS files in this partition,
including:

* files in ${dfs.data.dir}/tmp that were stale in-transit blocks left 
behind after a node crash

* Hadoop log files that we misconfigured to write to the same partition 
as ${dfs.data.dir}

* Other non-Hadoop batch system files that were sharing the same 
partition, before we moved HDFS and our batch system to separate partitions.

If you persistently see a large amount of "Non-DFS Used", then you might
start by running df on your ${dfs.data.dir} partition and du on its
directory structure to track down what these non-DFS files might be.
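
The same df/du comparison can be roughed out in a few lines of Python.
This is only a sketch under assumptions: the function names are made up
for illustration, the path you pass in is whatever your ${dfs.data.dir}
is set to, and it sums apparent file sizes rather than allocated blocks,
so it will not match du or the NameNode's own figure exactly.

```python
import os
import shutil


def directory_size(path):
    """Sum the apparent sizes of regular files under path (roughly `du -s`)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue  # do not count symlinks
            try:
                total += os.lstat(full).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def estimate_non_dfs_used(data_dir):
    """Estimate bytes used on data_dir's partition outside data_dir/current.

    Assumption: data_dir is one ${dfs.data.dir} entry and block data
    lives under its "current" subdirectory, as described above.
    """
    partition = shutil.disk_usage(data_dir)  # roughly `df` for the partition
    dfs_used = directory_size(os.path.join(data_dir, "current"))
    return max(partition.used - dfs_used, 0)
```

Calling estimate_non_dfs_used() with your data directory gives a rough
byte count to compare against the Web UI. To see where the space goes,
call directory_size() on the individual subdirectories of the partition.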

--Mike