Posted to user@pig.apache.org by Benny Sadeh <be...@gmail.com> on 2010/09/28 06:59:08 UTC

(PigJsonLoader) how to read/load json with Pig?

Loading/reading JSON for Pig processing sounds like common, useful
functionality.

However, I have not found any implementation of it.

(And yes, I know of Elephant Bird, which reads LZO-compressed JSON but not
regular JSON.)


I did see a reference in the "Hadoop Training: Introduction to Pig" video
(http://www.cloudera.com/videos/introduction_to_pig):

the downloadable IntroToPig.pdf mentions a PigJsonLoader.

However, there is no such UDF in the piggybank source of
the Cloudera-distributed VM, or in any other piggybank jar
that I have seen.

So I wonder: where can I find a Pig JSON reader/loader that can accomplish
the equivalent of: A = LOAD 'data.json' USING PigJsonLoader();

???


Any pointers would be greatly appreciated ...

Re: (PigJsonLoader) how to read/load json with Pig?

Posted by Ashutosh Chauhan <as...@gmail.com>.
For some reason, I always thought there was a JSONLoader in Piggybank.
Seems like there is none. Kim, it would be great if you could contribute
yours.

Ashutosh
On Tue, Sep 28, 2010 at 09:45, Kim Vogt <ki...@simplegeo.com> wrote:
> Here's mine:
>
> http://gist.github.com/601331
>
> Pretty much the same as the LZO one minus the LZO stuff.  Works with pig
> 0.7.
>
> -Kim

Re: (PigJsonLoader) how to read/load json with Pig?

Posted by Kim Vogt <ki...@simplegeo.com>.
Here's mine:

http://gist.github.com/601331

Pretty much the same as the LZO one minus the LZO stuff.  Works with pig
0.7.

-Kim
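
As a rough illustration of the core step a line-oriented JSON loader performs (this is not the code from the gist above), here is a minimal Python sketch. It assumes the input holds one JSON object per line; the function name parse_json_lines and the choice to skip malformed or non-object lines are assumptions made for illustration only.

```python
import json


def parse_json_lines(lines):
    """Yield one dict per line that contains a well-formed JSON object.

    Hypothetical helper: assumes one JSON object per line, which is the
    layout a line-oriented loader would typically expect.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except ValueError:
            continue  # assumption: skip malformed lines instead of failing
        if isinstance(record, dict):
            yield record  # each object becomes one key/value record


if __name__ == "__main__":
    sample = ['{"user": "benny", "count": 3}', "not json", '{"user": "kim"}']
    for row in parse_json_lines(sample):
        print(row)
```

A real Pig loader would wrap the same per-line step in Pig's load-function interface and emit each record as a tuple or map; that part is omitted here.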
