Posted to mapreduce-user@hadoop.apache.org by souri datta <so...@gmail.com> on 2011/03/04 18:57:30 UTC

custom InputFormat class

Hi,
 Is there a good tutorial for writing custom InputFormat classes?
Any help would be greatly appreciated.

Thanks,
Souri

Re: custom InputFormat class

Posted by souri datta <so...@gmail.com>.
FYI: moving back to Hadoop 0.20.2 solved my problem.

Thanks,
Souri

On Fri, Mar 4, 2011 at 11:27 PM, souri datta <so...@gmail.com> wrote:
> Hi,
>  Is there a good tutorial for writing custom InputFormat classes?
> Any help would be greatly appreciated.
>
> Thanks,
> Souri
>

Re: custom InputFormat class

Posted by souri datta <so...@gmail.com>.
Correction: I gave the wrong version earlier. I am using 0.21.0.

On Fri, Mar 4, 2011 at 11:51 PM, souri datta <so...@gmail.com> wrote:
> I was already going through TextInputFormat, but the corresponding
> LineRecordReader is not at all straightforward.
> Anyway, thanks for pointing it out. :)
>
> --Souri
>
> On Fri, Mar 4, 2011 at 11:41 PM, Harsh J <qw...@gmail.com> wrote:
>> It is worth reading some implementations of already existing
>> InputFormat classes, such as the simple TextInputFormat, or the
>> SequenceFileInputFormat which also has a RecordReader implementation
>> in it.
>>
>> You may find these source files in your downloaded Hadoop
>> distribution's src/ directory itself (in their appropriate packages).
>>
>> I do not know of an article that has a complete, tutorial-based
>> approach to this (yet). Perhaps others would know!
>>
>> On Fri, Mar 4, 2011 at 11:27 PM, souri datta <so...@gmail.com> wrote:
>>> Hi,
>>>  Is there a good tutorial for writing custom InputFormat classes?
>>> Any help would be greatly appreciated.
>>>
>>> Thanks,
>>> Souri
>>>
>>
>>
>>
>> --
>> Harsh J
>> www.harshj.com
>>
>

Re: custom InputFormat class

Posted by souri datta <so...@gmail.com>.
I am trying to use NLineInputFormat, but after compiling, when I run
it, I get the following error:
Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but
class was expected

Can someone tell me the reason for this? Does it have
anything to do with the version of Hadoop?
I am using 0.20

Thanks,
Souri
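
The message quoted above is the text the JVM typically attaches to a
java.lang.IncompatibleClassChangeError. It usually suggests that code was
compiled against a Hadoop release in which TaskAttemptContext is a class
(as in the 0.20.x line) and then run against one in which it is an
interface (as in 0.21.0). As a rough diagnostic sketch, the snippet below
uses plain reflection to report what a type looks like on the runtime
classpath. ApiKindCheck and describe are hypothetical names chosen for this
illustration; they are not part of Hadoop.

```java
// Hypothetical helper (not part of Hadoop): reports whether a named type
// on the runtime classpath is a class or an interface.
final class ApiKindCheck {

    // Returns "interface", "class", or "not found" for the given type name.
    static String describe(String typeName) {
        try {
            // initialize=false: load the type without running its static initializers.
            Class<?> type = Class.forName(
                    typeName, false, ApiKindCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            // The type is not visible on this classpath at all.
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Default to the type named in the error message.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.TaskAttemptContext";
        System.out.println(name + " -> " + describe(name));
    }
}
```

Run it with the same classpath the job uses. If the answer differs from
what the job's jar was compiled against, recompiling against the runtime
version (or switching the runtime version) is the likely fix.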

On Sat, Mar 5, 2011 at 12:02 AM, Harsh J <qw...@gmail.com> wrote:
> Yes, it may be too much to grasp in the first read. Reading a non
> text-based record reader implementation helps (something that has its
> own reader class, and just uses record readers to manage that). I'd
> suggested SequenceFile for this case.
>
> On Fri, Mar 4, 2011 at 11:51 PM, souri datta <so...@gmail.com> wrote:
>> I was already going through TextInputFormat, but the corresponding
>> LineRecordReader is not at all straightforward.
>> Anyway, thanks for pointing it out. :)
>
> --
> Harsh J
> www.harshj.com
>

Re: custom InputFormat class

Posted by Harsh J <qw...@gmail.com>.
Yes, it may be too much to grasp in the first read. Reading a non
text-based record reader implementation helps (something that has its
own reader class, and just uses record readers to manage that). I'd
suggested SequenceFile for this case.

On Fri, Mar 4, 2011 at 11:51 PM, souri datta <so...@gmail.com> wrote:
> I was already going through TextInputFormat, but the corresponding
> LineRecordReader is not at all straightforward.
> Anyway, thanks for pointing it out. :)

-- 
Harsh J
www.harshj.com

Re: custom InputFormat class

Posted by souri datta <so...@gmail.com>.
I was already going through TextInputFormat, but the corresponding
LineRecordReader is not at all straightforward.
Anyway, thanks for pointing it out. :)

--Souri

On Fri, Mar 4, 2011 at 11:41 PM, Harsh J <qw...@gmail.com> wrote:
> It is worth reading some implementations of already existing
> InputFormat classes, such as the simple TextInputFormat, or the
> SequenceFileInputFormat which also has a RecordReader implementation
> in it.
>
> You may find these source files in your downloaded Hadoop
> distribution's src/ directory itself (in their appropriate packages).
>
> I do not know of an article that has a complete, tutorial-based
> approach to this (yet). Perhaps others would know!
>
> On Fri, Mar 4, 2011 at 11:27 PM, souri datta <so...@gmail.com> wrote:
>> Hi,
>>  Is there a good tutorial for writing custom InputFormat classes?
>> Any help would be greatly appreciated.
>>
>> Thanks,
>> Souri
>>
>
>
>
> --
> Harsh J
> www.harshj.com
>

Re: custom InputFormat class

Posted by Harsh J <qw...@gmail.com>.
It is worth reading some implementations of already existing
InputFormat classes, such as the simple TextInputFormat, or the
SequenceFileInputFormat which also has a RecordReader implementation
in it.

You may find these source files in your downloaded Hadoop
distribution's src/ directory itself (in their appropriate packages).

I do not know of an article that has a complete, tutorial-based
approach to this (yet). Perhaps others would know!
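
Before reading those sources, it may help to see the shape of the contract
in isolation. In the org.apache.hadoop.mapreduce API, an InputFormat
provides getSplits and createRecordReader, and the RecordReader it returns
is driven through initialize, nextKeyValue, getCurrentKey, getCurrentValue,
getProgress and close. The sketch below imitates only the iteration part of
that contract with stand-in types so that it runs without Hadoop on the
classpath. SketchRecordReader and SketchLineReader are hypothetical names,
the keys are character offsets rather than the byte offsets
LineRecordReader uses, and splits, progress reporting and the checked
exceptions of the real API are left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Stand-in for the reader contract. The method names mirror
// org.apache.hadoop.mapreduce.RecordReader, but this type is hypothetical.
interface SketchRecordReader<K, V> {
    boolean nextKeyValue();   // advance; false when the input is exhausted
    K getCurrentKey();        // key of the record read by the last advance
    V getCurrentValue();      // value of the record read by the last advance
    void close();
}

// Emits (offset of line start, line text) pairs, loosely in the spirit of
// LineRecordReader. Assumption: lines end with a single '\n'.
final class SketchLineReader implements SketchRecordReader<Long, String> {
    private final BufferedReader in;
    private long nextOffset = 0;
    private Long key;
    private String value;

    SketchLineReader(String data) {
        this.in = new BufferedReader(new StringReader(data));
    }

    @Override
    public boolean nextKeyValue() {
        try {
            String line = in.readLine();
            if (line == null) {
                key = null;
                value = null;
                return false;
            }
            key = nextOffset;
            value = line;
            nextOffset += line.length() + 1; // +1 for the assumed '\n'
            return true;
        } catch (IOException e) {
            // Simplification: the real API declares checked exceptions instead.
            throw new UncheckedIOException(e);
        }
    }

    @Override public Long getCurrentKey() { return key; }
    @Override public String getCurrentValue() { return value; }

    @Override
    public void close() {
        try {
            in.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SketchLineReader reader = new SketchLineReader("alpha\nbeta\ngamma");
        // The framework drives a reader in essentially this loop.
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey() + "\t" + reader.getCurrentValue());
        }
        reader.close();
    }
}
```

A real implementation would extend the Hadoop RecordReader class, open the
file described by the split in initialize, and be returned from a custom
InputFormat's createRecordReader.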

On Fri, Mar 4, 2011 at 11:27 PM, souri datta <so...@gmail.com> wrote:
> Hi,
>  Is there a good tutorial for writing custom InputFormat classes?
> Any help would be greatly appreciated.
>
> Thanks,
> Souri
>



-- 
Harsh J
www.harshj.com