Posted to dev@spark.apache.org by Vibhanshu Prasad <vi...@gmail.com> on 2014/11/16 12:22:35 UTC

Regarding RecordReader of spark

Hello Everyone,

I am going through the source code of the RDD and record reader classes.
I found 2 classes:

1. WholeTextFileRecordReader
2. WholeCombineFileRecordReader  ( extends CombineFileRecordReader )

The descriptions of the two classes are nearly identical.

I am not able to understand why we have 2 classes. Does
CombineFileRecordReader provide some extra advantage?

Regards
Vibhanshu
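
As background for the question above: Hadoop's CombineFileRecordReader is described in its Javadoc as a generic reader that hands out a separate record reader for each chunk of a combined split. The sketch below illustrates only that delegation pattern in plain Java. CombineReaderSketch, WholeFileReader, CombineReader, and readAll are hypothetical stand-ins, not the Spark or Hadoop classes, and it is an assumption (not confirmed in this thread) that the two Spark classes divide the work this way.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified stand-ins; these are NOT the Hadoop or Spark classes.
final class CombineReaderSketch {

    /** Reads exactly one file and emits a single (path, content) record. */
    static final class WholeFileReader {
        private final String path;
        private final String content;
        private boolean consumed = false;

        WholeFileReader(String path, String content) {
            this.path = path;
            this.content = content;
        }

        /** Returns the single record on the first call, then null. */
        Map.Entry<String, String> next() {
            if (consumed) {
                return null;
            }
            consumed = true;
            return new AbstractMap.SimpleImmutableEntry<>(path, content);
        }
    }

    /** Walks the files of one combined split, delegating each file to a fresh per-file reader. */
    static final class CombineReader {
        private final List<String> paths;
        private final Function<String, WholeFileReader> readerFactory;
        private int index = 0;
        private WholeFileReader current = null;

        CombineReader(List<String> paths, Function<String, WholeFileReader> readerFactory) {
            this.paths = paths;
            this.readerFactory = readerFactory;
        }

        /** Returns the next record across all files, or null when the split is exhausted. */
        Map.Entry<String, String> next() {
            while (true) {
                if (current != null) {
                    Map.Entry<String, String> record = current.next();
                    if (record != null) {
                        return record;
                    }
                    current = null; // this file is done; move on to the next one
                }
                if (index >= paths.size()) {
                    return null;
                }
                current = readerFactory.apply(paths.get(index++));
            }
        }
    }

    /** Reads every file in the (in-memory, illustrative) map as one "path=content" record. */
    static List<String> readAll(Map<String, String> files) {
        CombineReader reader = new CombineReader(
                new ArrayList<>(files.keySet()),
                path -> new WholeFileReader(path, files.get(path)));
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> r = reader.next(); r != null; r = reader.next()) {
            out.add(r.getKey() + "=" + r.getValue());
        }
        return out;
    }
}
```

In this sketch the outer reader knows nothing about file contents; it only decides which file comes next, while the per-file reader turns one whole file into one record. That separation is one plausible reason to have two classes with similar descriptions.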

Re: Regarding RecordReader of spark

Posted by Andrew Ash <an...@andrewash.com>.
Filed as https://issues.apache.org/jira/browse/SPARK-4437

On Sun, Nov 16, 2014 at 4:49 PM, Reynold Xin <rx...@databricks.com> wrote:

> I don't think the code is immediately obvious.
>
> Davies - I think you added the code, and Josh reviewed it. Can you guys
> explain and maybe submit a patch to add more documentation on the whole
> thing?
>
> Thanks.

Re: Regarding RecordReader of spark

Posted by Reynold Xin <rx...@databricks.com>.
I don't think the code is immediately obvious.

Davies - I think you added the code, and Josh reviewed it. Can you guys
explain and maybe submit a patch to add more documentation on the whole
thing?

Thanks.

