Posted to hdfs-dev@hadoop.apache.org by Vivi Lang <sq...@gmail.com> on 2012/09/13 02:10:02 UTC

Question about

Hi all,

Can anyone tell me: when we launch a MapReduce task (for example,
wordcount), after the JobClient has obtained the block locations (the
related hosts/datanodes are stored in the specified split), which
function/class is called to read those blocks from the datanode?

Thanks,
Vivian

Re: Question about

Posted by Harsh J <ha...@cloudera.com>.
MR does not read the files in the front-end (unless a partitioner such
as TotalOrderPartitioner demands it). The actual block-level read is
done via the DFSClient class and its stream classes, DFSInputStream and
DFSOutputStream; the first one should be where your interest lies.

All MR cares about is scheduling tasks close to their data, so it just
takes the block locations (metadata), conjures up split objects for the
scheduler and the task, and sends them across.
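
To make the "splits are only metadata" point concrete, here is a minimal
sketch in plain Java. It is not Hadoop code: BlockMeta, SplitMeta and
SplitPlanner are hypothetical names, and the one-split-per-block rule is an
assumption made to keep the example short. It only shows that planning
splits needs offsets, lengths and host names, never the file contents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the block metadata a client gets back; not a Hadoop class.
final class BlockMeta {
    final long offset;
    final long length;
    final String[] hosts;

    BlockMeta(long offset, long length, String... hosts) {
        this.offset = offset;
        this.length = length;
        this.hosts = hosts;
    }
}

// Hypothetical stand-in for an input split: a byte range plus preferred hosts, no data.
final class SplitMeta {
    final String path;
    final long start;
    final long length;
    final String[] hosts;

    SplitMeta(String path, long start, long length, String[] hosts) {
        this.path = path;
        this.start = start;
        this.length = length;
        this.hosts = hosts;
    }

    @Override
    public String toString() {
        return path + ":" + start + "+" + length + " " + Arrays.toString(hosts);
    }
}

final class SplitPlanner {
    // Assumption of this sketch: exactly one split per block.
    // No byte of the file is read here; only metadata is copied.
    static List<SplitMeta> planSplits(String path, List<BlockMeta> blocks) {
        List<SplitMeta> splits = new ArrayList<>();
        for (BlockMeta b : blocks) {
            splits.add(new SplitMeta(path, b.offset, b.length, b.hosts));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<BlockMeta> blocks = Arrays.asList(
                new BlockMeta(0, 64, "dn1", "dn2"),
                new BlockMeta(64, 36, "dn3"));
        for (SplitMeta s : planSplits("/in/a.txt", blocks)) {
            System.out.println(s);
        }
    }
}
```

A scheduler could then try to place each task on one of the hosts listed in
its split; the data itself is only read later, inside the task.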

-- 
Harsh J

RE: Question about

Posted by Charles Baker <cb...@sdl.com>.
Hi Vivian. Take a look at TextInputFormat and the RecordReader classes. The
input format is set via JobConf.setInputFormat().
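
As a rough illustration of what a line-oriented record reader does with a
split, here is a plain-Java sketch that works on an in-memory byte array
instead of an HDFS stream. LineSplitReader and readSplit are hypothetical
names, and the boundary rule (a line belongs to the split that contains its
first byte) is an assumption of this sketch, not a statement about Hadoop's
exact implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a line reader bound to one byte range; not a Hadoop class.
final class LineSplitReader {
    // Returns the lines whose first byte lies in [start, start + length).
    static List<String> readSplit(byte[] data, long start, long length) {
        long end = Math.min(start + length, data.length);
        int pos = (int) start;
        if (start != 0) {
            // Back up one byte and discard everything through the next newline.
            // If the previous byte is itself a newline, nothing else is skipped,
            // so a line starting exactly at 'start' stays with this split.
            pos = (int) start - 1;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline
        }
        List<String> lines = new ArrayList<>();
        while (pos < end) {
            // A line that starts inside the range is read in full,
            // even if it runs past the end of the range.
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            lines.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readSplit(data, 0, 8));
        System.out.println(readSplit(data, 8, 9));
    }
}
```

With this rule, two adjacent ranges never return the same line twice and
never drop one, even when the boundary falls in the middle of a line.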

-Chuck
