Posted to user@pig.apache.org by Jonathan Coveney <jc...@gmail.com> on 2012/06/21 23:34:48 UTC

Is there a loader that loads a file as a line?

It can even be a bytearray. Basically I have a bunch of files, and I want
one file -> one row. Is there an easy way to do this? Or will I need to
provide a special fileinputformat etc?

Re: Is there a loader that loads a file as a line?

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Jonathan,
        Have a look at Hadoop's WholeFileInputFormat. It might fit
your requirements.
Regards,
    Mohammad Tariq


On Fri, Jun 22, 2012 at 3:39 AM, Prashant Kommireddi
<pr...@gmail.com> wrote:
> I think you will need to implement a RecordReader/InputFormat of your own
> for this and use it with a LoadFunc. Not sure if Hadoop has a Reader that
> you could re-use for this.
>
> How do you handle the case when a file exceeds block size?
>
> On Thu, Jun 21, 2012 at 2:34 PM, Jonathan Coveney <jc...@gmail.com> wrote:
>
>> It can even be a bytearray. Basically I have a bunch of files, and I want
>> one file -> one row. Is there an easy way to do this? Or will I need to
>> provide a special fileinputformat etc?
>>

Re: Is there a loader that loads a file as a line?

Posted by Prashant Kommireddi <pr...@gmail.com>.
I think you will need to implement a RecordReader/InputFormat of your own
for this and use it with a LoadFunc. Not sure if Hadoop has a Reader that
you could re-use for this.

How do you handle the case when a file exceeds block size?
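
For reference, the "one file -> one row" behavior asked about above can be sketched in plain Java. This is only an illustration of the idea using the JDK, not a Pig LoadFunc or a Hadoop InputFormat; the class name WholeFileRows, the Row type, and the load method are hypothetical names made up for this sketch. It also assumes each file fits in memory, which is the practical limit behind the block-size question: a real InputFormat would additionally have to declare its files non-splittable so that a file larger than one block still arrives as a single record.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Plain-Java sketch of "one file -> one row"; hypothetical, not a Pig or Hadoop API. */
final class WholeFileRows {

    /** One row: the file's name plus its entire content as a byte array. */
    static final class Row {
        public final String fileName;
        public final byte[] content;

        Row(String fileName, byte[] content) {
            this.fileName = fileName;
            this.content = content;
        }
    }

    /** Reads every regular file directly under dir into exactly one row each. */
    static List<Row> load(Path dir) throws IOException {
        List<Path> files;
        // Files.list holds a directory handle, so close it via try-with-resources.
        try (Stream<Path> listing = Files.list(dir)) {
            files = listing.filter(Files::isRegularFile)
                           .sorted() // deterministic row order by path
                           .collect(Collectors.toList());
        }
        List<Row> rows = new ArrayList<>();
        for (Path file : files) {
            // The whole file becomes one value; it is never split on newlines.
            rows.add(new Row(file.getFileName().toString(), Files.readAllBytes(file)));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Demo data: two small files in a fresh temporary directory.
        Path dir = Files.createTempDirectory("whole-file-rows");
        Files.write(dir.resolve("a.txt"), "line 1\nline 2\n".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("b.txt"), "single line".getBytes(StandardCharsets.UTF_8));

        for (Row row : load(dir)) {
            System.out.println(row.fileName + " -> " + row.content.length + " bytes");
        }
    }
}
```

A multi-line file such as a.txt still yields a single row, which is the property a custom RecordReader would need to preserve.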

On Thu, Jun 21, 2012 at 2:34 PM, Jonathan Coveney <jc...@gmail.com> wrote:

> It can even be a bytearray. Basically I have a bunch of files, and I want
> one file -> one row. Is there an easy way to do this? Or will I need to
> provide a special fileinputformat etc?
>