Posted to common-user@hadoop.apache.org by Keith Wiley <kw...@keithwiley.com> on 2013/04/22 19:19:41 UTC

Is disk use reported with replication?

Simple question: When I issue a "hadoop fs -du" command and/or when I view the namenode web UI to see HDFS disk utilization (which the namenode reports both as bytes and percentage), should I expect to see disk use reported as "true data size" or as replicated size (e.g. with 3X replication, should I expect reported values to be three times higher than the actual underlying data itself)?

Thanks.

________________________________________________________________________________
Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"I used to be with it, but then they changed what it was.  Now, what I'm with
isn't it, and what's it seems weird and scary to me."
                                           --  Abe (Grandpa) Simpson
________________________________________________________________________________


Re: Is disk use reported with replication?

Posted by burberry blues <bl...@gmail.com>.
Hi

I am new to the Hadoop world. Can you please let me know what a Hadoop stack is?

Thanks,
Burberry


On Mon, Apr 22, 2013 at 10:19 AM, Keith Wiley <kw...@keithwiley.com> wrote:

> Simple question: When I issue a "hadoop fs -du" command and/or when I view
> the namenode web UI to see HDFS disk utilization (which the namenode
> reports both as bytes and percentage), should I expect to see disk use
> reported as "true data size" or as replicated size (e.g. with 3X
> replication, should I expect reported values to be three times higher than
> the actual underlying data itself)?
>
> Thanks.

Re: Is disk use reported with replication?

Posted by Harsh J <ha...@cloudera.com>.
Hi Keith,

"fs -du" sums the lengths of the files, so it does not report the
replicated on-disk size. The HDFS disk utilization shown by the
namenode, OTOH, is a simple report of the current used/free disk
space, so it does include replicated data.
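
To make the difference concrete, here is a minimal Python sketch of the
arithmetic. It is not Hadoop code: the function name, file sizes, and
replication factors are made up for illustration, and it ignores
checksum files, non-DFS usage, and under-replicated blocks, so real
numbers will only roughly match.

```python
def logical_and_raw_bytes(files):
    """Return (logical, raw) byte totals.

    `files` is an iterable of (length_in_bytes, replication_factor) pairs.
    "logical" is the sum of file lengths, i.e. roughly what "fs -du" adds up.
    "raw" multiplies each length by its replication factor, i.e. roughly
    what the cluster-wide "used" figure reflects.
    """
    files = list(files)
    logical = sum(length for length, _ in files)
    raw = sum(length * replication for length, replication in files)
    return logical, raw


MIB = 1024 ** 2

# Hypothetical files: two at 3X replication, one at 2X.
example_files = [
    (100 * MIB, 3),
    (50 * MIB, 3),
    (10 * MIB, 2),
]

logical, raw = logical_and_raw_bytes(example_files)
print("logical bytes:", logical)  # 160 MiB
print("raw bytes:    ", raw)      # 470 MiB
```

With uniform 3X replication the raw figure is simply three times the
logical one; mixed replication factors, as above, give something in
between.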

On Mon, Apr 22, 2013 at 10:49 PM, Keith Wiley <kw...@keithwiley.com> wrote:
> Simple question: When I issue a "hadoop fs -du" command and/or when I view the namenode web UI to see HDFS disk utilization (which the namenode reports both as bytes and percentage), should I expect to see disk use reported as "true data size" or as replicated size (e.g. with 3X replication, should I expect reported values to be three times higher than the actual underlying data itself)?
>
> Thanks.



-- 
Harsh J