Posted to user@hadoop.apache.org by Weishung Chung <we...@gmail.com> on 2012/08/16 20:18:14 UTC

question about file split

Hey fellow developers,

I am trying to figure out which class in the code base handles a record
that runs across a block boundary when reading a file split. I have been
digging through LineRecordReader, FileInputFormat, TextInputFormat, etc.

Thank you,
Wei Shung

Re: question about file split

Posted by Weishung Chung <we...@gmail.com>.
Thanks a lot... digesting it now :)

On Thu, Aug 16, 2012 at 11:29 AM, Harsh J <ha...@cloudera.com> wrote:

> Weishung,
>
> For text files, this is done by the LineRecordReader.
>
> See
> http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java?view=markup
> Specifically, see L126-L131 and the loop from around L164 onwards.
> These parts of the logic correlate with the logic described at
> http://wiki.apache.org/hadoop/HadoopMapReduce.
>
> On Thu, Aug 16, 2012 at 11:48 PM, Weishung Chung <we...@gmail.com> wrote:
> > Hey fellow developers,
> >
> > I am trying to figure out in the code base, which class does the handling of
> > record running across block boundary when reading a file split. I have been
> > digging through LineRecordReader, FileInputFormat, TextInputFormat, and etc.
> >
> > Thank you,
> > Wei Shung
>
>
>
> --
> Harsh J
>

Re: question about file split

Posted by Harsh J <ha...@cloudera.com>.
Weishung,

For text files, this is done by the LineRecordReader.

See http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java?view=markup
Specifically, see L126-L131 and the loop from around L164 onwards.
These parts of the logic correlate with the logic described at
http://wiki.apache.org/hadoop/HadoopMapReduce.
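
To make the idea concrete, here is a minimal in-memory sketch of the two rules a line-oriented reader can follow at split boundaries. It is not the Hadoop source, only an illustration modeled on that approach. It assumes '\n' line endings and uses a byte array in place of the file, and the names SplitLineReaderSketch, readSplit, nextLineStart and lineAt are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop.
final class SplitLineReaderSketch {

    /** Returns the lines "owned" by the split [start, start + length) of data. */
    static List<String> readSplit(byte[] data, long start, long length) {
        int pos = (int) start;
        int end = (int) Math.min(data.length, start + length);
        List<String> lines = new ArrayList<>();

        // Rule 1: every split except the first skips its first (possibly
        // partial) line, because the reader of the previous split emits it.
        if (start != 0) {
            pos = nextLineStart(data, pos);
        }

        // Rule 2: keep reading while a line STARTS at or before the split end.
        // This reads one line past the boundary, which is exactly the line
        // that the next split's reader skips under rule 1.
        while (pos <= end && pos < data.length) {
            int next = nextLineStart(data, pos);
            lines.add(lineAt(data, pos, next));
            pos = next;
        }
        return lines;
    }

    /** Index just past the next '\n' at or after from, or data.length if none. */
    private static int nextLineStart(byte[] data, int from) {
        int i = from;
        while (i < data.length && data[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, data.length);
    }

    /** Decodes data[from, to) as UTF-8, dropping a trailing '\n' if present. */
    private static String lineAt(byte[] data, int from, int to) {
        int stop = (to > from && data[to - 1] == '\n') ? to - 1 : to;
        return new String(data, from, stop - from, StandardCharsets.UTF_8);
    }
}
```

For example, with the 20 bytes "hello world\nfoo\nbar\n" cut into splits [0,5), [5,14) and [14,20), calling readSplit on each yields "hello world", "foo" and "bar" respectively. The first line straddles the first boundary, yet it is emitted exactly once, by the reader of the split in which it starts.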

On Thu, Aug 16, 2012 at 11:48 PM, Weishung Chung <we...@gmail.com> wrote:
> Hey fellow developers,
>
> I am trying to figure out in the code base, which class does the handling of
> record running across block boundary when reading a file split. I have been
> digging through LineRecordReader, FileInputFormat, TextInputFormat, and etc.
>
> Thank you,
> Wei Shung



-- 
Harsh J
