Posted to user@pig.apache.org by Mark <st...@gmail.com> on 2011/04/07 03:30:00 UTC
Loading arbitrary objects
If I wanted to load arbitrary objects into tuples, what classes
should I be looking at? Would I need some kind of storage class?
For example, I have a data file that contains
org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns
objects. I would like to iterate over them in Pig using something like:
rows = LOAD 'data' using TopKStringPatternsStorage();
Is this correct? Is there a wiki page on creating storage classes? Is there
anything I should look out for?
Thanks for the pointers
Re: Loading arbitrary objects
Posted by Mridul Muralidharan <mr...@yahoo-inc.com>.
If the arbitrary objects you refer to fit nicely into Pig's notion of
tuples/bags/maps/primitives, then you can use those types directly.
Otherwise, because Pig's schema has only limited support for complex or
arbitrary objects (there is no support for something like Writable, for
example), you will most probably need to treat the objects as bytearray
(assuming they are serializable) and convert to/from byte[] wherever they
are used. Pig currently does not allow you to decouple an object from its
serialization.
Regards,
Mridul
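To illustrate the byte[] round trip suggested above, here is a minimal sketch that uses only the Java standard library. The Pattern class is a hypothetical stand-in for the real object (it is not the Mahout class), and its write/readFields methods only mimic the shape of Hadoop's Writable contract without depending on Hadoop. In an actual Pig loader or UDF, the resulting byte[] would presumably be wrapped in a DataByteArray so that Pig sees it as a bytearray field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an object that does not fit a Pig schema. */
class Pattern {
    List<String> items = new ArrayList<>();
    long support;

    // Same shape as Writable.write(DataOutput): size, then items, then support.
    void write(DataOutput out) throws IOException {
        out.writeInt(items.size());
        for (String item : items) {
            out.writeUTF(item);
        }
        out.writeLong(support);
    }

    // Same shape as Writable.readFields(DataInput): reads fields in write order.
    void readFields(DataInput in) throws IOException {
        items.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            items.add(in.readUTF());
        }
        support = in.readLong();
    }
}

public class ByteArrayRoundTrip {
    /** Serializes the object into the byte[] that would be handed to Pig. */
    static byte[] toBytes(Pattern pattern) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            pattern.write(out);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds the object from a byte[] that came back out of Pig. */
    static Pattern fromBytes(byte[] bytes) throws IOException {
        Pattern pattern = new Pattern();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            pattern.readFields(in);
        }
        return pattern;
    }

    public static void main(String[] args) throws IOException {
        Pattern original = new Pattern();
        original.items.add("milk");
        original.items.add("bread");
        original.support = 42L;

        Pattern copy = fromBytes(toBytes(original));
        System.out.println(copy.items + " " + copy.support);
    }
}
```

The trade-off of this approach is that Pig treats the field as opaque bytes, so any script step that needs the object's contents has to go through a UDF that calls something like fromBytes first.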
Re: Loading arbitrary objects
Posted by Daniel Dai <ji...@yahoo-inc.com>.
You need a LoadFunc. See
http://pig.apache.org/docs/r0.8.0/udf.html#Load+Functions for how to
write one.
Daniel