Posted to mapreduce-user@hadoop.apache.org by ps...@gmail.com on 2010/11/30 19:05:29 UTC

FILE_BYTES_WRITTEN and HDFS_BYTES_WRITTEN

When a Hadoop MapReduce example is executed, a table is shown at the end
with information about the execution, such as the number of map and
reduce tasks executed and the number of bytes read and written.

This table contains two fields, FILE_BYTES_WRITTEN and
HDFS_BYTES_WRITTEN. What is the difference between these two fields?

Thanks,
Pedro

Re: FILE_BYTES_WRITTEN and HDFS_BYTES_WRITTEN

Posted by Pedro Costa <ps...@gmail.com>.
This is not what I asked. Maybe I did not express myself correctly.

1 - I suppose that the map tasks are the only tasks that write to
local disk, and that this happens when the map output is created. Does
FILE_BYTES_WRITTEN correspond only to the total number of bytes
produced by all map tasks while creating the map outputs, or is there
another phase besides the creation of the map outputs that contributes
to this field (for example, the creation of directories during the
setup phase)?

2 - Is HDFS_BYTES_WRITTEN the sum of the bytes of the output of all
reduce tasks saved in HDFS?

On Tue, Nov 30, 2010 at 8:43 PM, Niels Basjes <Ni...@basjes.nl> wrote:
> For some parts of a task the system stores information on the local
> (non-HDFS) file system of the node that is actually running the job.
> That is the FILE_.. Stuff written to HDFS is the HDFS_...
>
> HTH
>
> --
> Met vriendelijke groeten,
>
> Niels Basjes
>



-- 
Pedro

Re: FILE_BYTES_WRITTEN and HDFS_BYTES_WRITTEN

Posted by Niels Basjes <Ni...@basjes.nl>.
For some parts of a task, the system stores data on the local
(non-HDFS) file system of the node that is actually running the job.
Those bytes are counted in the FILE_.. counters. Data written to HDFS
is counted in the HDFS_... counters.
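
As a rough illustration of the distinction described above, the sketch
below tallies bytes written per file system. It is not Hadoop code: the
functions counter_name and tally, the paths, and the byte sizes are all
invented for this example, and the rule of naming each counter after the
URI scheme of the path is an assumption made for the sketch.

```python
from collections import Counter
from urllib.parse import urlparse


def counter_name(path: str) -> str:
    """Map a path to a counter name by its URI scheme (illustrative rule)."""
    scheme = urlparse(path).scheme or "file"  # no scheme: treat as local disk
    return f"{scheme.upper()}_BYTES_WRITTEN"


def tally(writes):
    """Sum the bytes written per counter for (path, num_bytes) pairs."""
    counters = Counter()
    for path, num_bytes in writes:
        counters[counter_name(path)] += num_bytes
    return counters


# Hypothetical writes made by the tasks of one job (paths and sizes invented).
writes = [
    ("file:///tmp/mapred/local/map_0.out", 1200),        # local disk
    ("file:///tmp/mapred/local/map_1.out", 800),         # local disk
    ("hdfs://namenode/user/pedro/out/part-00000", 500),  # HDFS
]

counters = tally(writes)
print(counters["FILE_BYTES_WRITTEN"])  # 2000
print(counters["HDFS_BYTES_WRITTEN"])  # 500
```

In this toy model, anything written to the node's local file system adds
to FILE_BYTES_WRITTEN, and anything written to HDFS adds to
HDFS_BYTES_WRITTEN, whichever task performed the write.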

HTH

-- 
Met vriendelijke groeten,

Niels Basjes