Posted to common-user@hadoop.apache.org by Pralabh Kumar <pr...@gmail.com> on 2013/04/29 10:50:12 UTC

Relationship between HDFS_BYTES_READ and Map input bytes

Hi

What is the relationship between the HDFS_BYTES_READ and Map input bytes
counters? Why can they differ for a particular MR job?

Thanks and Regards,
Pralabh Kumar

Re: Relationship between HDFS_BYTES_READ and Map input bytes

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.

They can differ if the maps read HDFS files directly, either instead of or in addition to getting key-value pairs through the map interface.

HDFS_BYTES_READ will always be greater than or equal to map-input-bytes.
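
To make the point above concrete, here is a minimal toy model in Python. It is only a sketch: CountingFileSystem and run_map_task are hypothetical names made up for illustration and are not Hadoop APIs. It assumes every filesystem read is counted, while the map-input counter grows only for records handed to the map step.

```python
# Toy model only: these classes and functions are hypothetical, not Hadoop APIs.
class CountingFileSystem:
    """Stands in for HDFS; every read adds to a bytes-read counter."""

    def __init__(self, files):
        self.files = files            # path -> bytes
        self.hdfs_bytes_read = 0      # models HDFS_BYTES_READ

    def read(self, path):
        data = self.files[path]
        self.hdfs_bytes_read += len(data)
        return data


def run_map_task(fs, split_path, side_path=None):
    """Feed each line of the split to a map step; optionally read a side file."""
    map_input_bytes = 0               # models "Map input bytes"
    if side_path is not None:
        fs.read(side_path)            # direct read that bypasses the map interface
    for record in fs.read(split_path).splitlines(keepends=True):
        map_input_bytes += len(record)  # counted only for records handed to map()
    return map_input_bytes


files = {"/in/part-0": b"a,1\nb,2\nc,3\n", "/side/lookup": b"x" * 100}

fs_plain = CountingFileSystem(files)
plain_input = run_map_task(fs_plain, "/in/part-0")

fs_side = CountingFileSystem(files)
side_input = run_map_task(fs_side, "/in/part-0", "/side/lookup")

print(plain_input, fs_plain.hdfs_bytes_read)  # counters match
print(side_input, fs_side.hdfs_bytes_read)    # filesystem counter is larger
```

In this model the two counters agree when the task only consumes its split, and the filesystem counter pulls ahead by the size of the side file once the task reads it directly.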

Thanks,
+Vinod

On Apr 29, 2013, at 1:50 AM, Pralabh Kumar wrote:

> Hi
> 
> What is the relationship between the HDFS_BYTES_READ and Map input bytes counters? Why can they differ for a particular MR job?
> 
> Thanks and Regards
> Pralabh Kumar
>

Re: Relationship between HDFS_BYTES_READ and Map input bytes

Posted by YouPeng Yang <yy...@gmail.com>.

Hi Pralabh,

 1. The Map input bytes counter belongs to the MapReduce framework.
    "Hadoop: The Definitive Guide" describes it as:
    The number of bytes of uncompressed input consumed by all the maps in the
    job. Incremented every time a record is read from a RecordReader and
    passed to the map’s map() method by the framework.

 2. HDFS_BYTES_READ is a built-in counter of the HDFS filesystem:
    The number of bytes read by each filesystem by map and reduce tasks.

2013/4/29 Pralabh Kumar <pr...@gmail.com>

> Hi
>
> What is the relationship between the HDFS_BYTES_READ and Map input bytes
> counters? Why can they differ for a particular MR job?
>
> Thanks and Regards
> Pralabh Kumar
