Posted to user@pig.apache.org by Gianmarco <gi...@gmail.com> on 2011/04/15 13:30:59 UTC

Pig to Hadoop complex data exchange format

Hi all,

I have a Pig script that produces a complex nested data structure:

result: {child: chararray, childTraces: {action: int, time: long}, legacy:
{parent: chararray, parentTraces: {action: int, time: long}}}

I would like to post-process the output of the Pig script with a MapReduce
job.
In that job I would like to run nested for loops that iterate over the
bags.

Do you have any advice on the simplest way to store Pig's output so that
I don't have to write my own parser in MapReduce?
I thought about using JSON, but it looks like there is no JSON store format
for tuples yet (I know elephant-bird can store maps, but I would need to
convert my result to a nested map, which is a bit unnatural).
Avro is not an easy option on the hadoop side.
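
To make the goal concrete, here is a minimal Python sketch of the nested
iteration I have in mind on the consuming side. It assumes, purely for
illustration, that each record were written as one JSON object per line.
The field names follow the schema above; the sample values and the helper
name iter_traces are made up, and this is not an existing Pig store format.

```python
import json

# One hypothetical output record, shaped like the schema above.
# All values are invented for illustration.
SAMPLE_LINE = json.dumps({
    "child": "c1",
    "childTraces": [{"action": 1, "time": 100}, {"action": 2, "time": 150}],
    "legacy": [
        {"parent": "p1", "parentTraces": [{"action": 3, "time": 90}]},
        {"parent": "p2", "parentTraces": [{"action": 4, "time": 80},
                                          {"action": 5, "time": 85}]},
    ],
})


def iter_traces(line):
    """Yield (child, parent, child_trace, parent_trace) for every combination."""
    record = json.loads(line)
    # Outer loop over the child's traces, inner loops over the nested bags.
    for child_trace in record["childTraces"]:
        for parent in record["legacy"]:
            for parent_trace in parent["parentTraces"]:
                yield (record["child"], parent["parent"],
                       child_trace, parent_trace)


if __name__ == "__main__":
    for child, parent, ct, pt in iter_traces(SAMPLE_LINE):
        print(child, parent, ct["action"], ct["time"], pt["action"], pt["time"])
```

The point is only that a self-describing format would let the job walk the
bags directly instead of parsing Pig's text output by hand.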

Any help would be highly appreciated.

Thanks,
--
Gianmarco De Francisci Morales

Re: Pig to Hadoop complex data exchange format

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
You can use Thrift or Protobufs using elephant-bird.

D

On Wed, May 25, 2011 at 10:29 AM, Andrew Hammond
<an...@gmail.com> wrote:
> Bump. I'm very interested in combining Avro and Pig and would greatly
> appreciate hearing about people's experiences with them.
>
> On Mon, Apr 18, 2011 at 2:23 AM, Gianmarco <gi...@gmail.com> wrote:
>
>> Last time I checked Avro on MR the integration was not yet ready.
>>
>> I see that there was a very recent release of code for it
>> http://www.tomslabs.com/
>> but I don't know how stable and tested this code is.
>>
>> Has anyone had good experiences with Avro on MR?
>>
>> Cheers,
>> --
>> Gianmarco De Francisci Morales
>>
>>
>> On Fri, Apr 15, 2011 at 20:04, Harsh J <ha...@cloudera.com> wrote:
>>
>> > Hey Gianmarco,
>> >
>> > On Fri, Apr 15, 2011 at 5:00 PM, Gianmarco <gi...@gmail.com>
>> > wrote:
>> > > Avro is not an easy option on the hadoop side.
>> >
>> > Am just a little curious on this, could you explain why you feel so
>> > about Avro on M/R?
>> >
>> > --
>> > Harsh J
>> >
>>
>

Re: Pig to Hadoop complex data exchange format

Posted by Andrew Hammond <an...@gmail.com>.
Bump. I'm very interested in combining Avro and Pig and would greatly
appreciate hearing about people's experiences with them.

On Mon, Apr 18, 2011 at 2:23 AM, Gianmarco <gi...@gmail.com> wrote:

> Last time I checked Avro on MR the integration was not yet ready.
>
> I see that there was a very recent release of code for it
> http://www.tomslabs.com/
> but I don't know how stable and tested this code is.
>
> Has anyone had good experiences with Avro on MR?
>
> Cheers,
> --
> Gianmarco De Francisci Morales
>
>
> On Fri, Apr 15, 2011 at 20:04, Harsh J <ha...@cloudera.com> wrote:
>
> > Hey Gianmarco,
> >
> > On Fri, Apr 15, 2011 at 5:00 PM, Gianmarco <gi...@gmail.com>
> > wrote:
> > > Avro is not an easy option on the hadoop side.
> >
> > Am just a little curious on this, could you explain why you feel so
> > about Avro on M/R?
> >
> > --
> > Harsh J
> >
>

Re: Pig to Hadoop complex data exchange format

Posted by Gianmarco <gi...@gmail.com>.
Last time I checked Avro on MR the integration was not yet ready.

I see that there was a very recent release of code for it
http://www.tomslabs.com/
but I don't know how stable and tested this code is.

Has anyone had good experiences with Avro on MR?

Cheers,
--
Gianmarco De Francisci Morales


On Fri, Apr 15, 2011 at 20:04, Harsh J <ha...@cloudera.com> wrote:

> Hey Gianmarco,
>
> On Fri, Apr 15, 2011 at 5:00 PM, Gianmarco <gi...@gmail.com>
> wrote:
> > Avro is not an easy option on the hadoop side.
>
> Am just a little curious on this, could you explain why you feel so
> about Avro on M/R?
>
> --
> Harsh J
>

Re: Pig to Hadoop complex data exchange format

Posted by Harsh J <ha...@cloudera.com>.
Hey Gianmarco,

On Fri, Apr 15, 2011 at 5:00 PM, Gianmarco <gi...@gmail.com> wrote:
> Avro is not an easy option on the hadoop side.

I'm just a little curious about this: could you explain why you feel that
way about Avro on M/R?

-- 
Harsh J