Posted to user@pig.apache.org by Daniel Dai <da...@hortonworks.com> on 2011/08/26 23:37:30 UTC

Re: Question about pig & HDFS

By default, Pig uses plain text files for input and output (PigStorage,
tab-delimited) unless you write a custom LoadFunc/StoreFunc. There is
no Pig-specific storage format. You can copy the output files to the
local file system with copyToLocal. If you want to export directly to
a SQL table, you need to write a StoreFunc. Pig works on tuples rather
than K,V pairs; the key in the MR job is thrown away.
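
As a rough illustration of the copyToLocal route, here is a minimal
sketch that loads the copied text files into a SQL table. It is not
part of Pig. It assumes the output was stored with the default
PigStorage (tab-delimited, one tuple per line, files named part-*) and
was already copied to a local directory, and it uses SQLite only as a
stand-in for whatever database you actually target. The function name
load_pig_output and the table and column names are hypothetical.

```python
import glob
import os
import sqlite3


def load_pig_output(conn, output_dir, table, columns):
    """Load tab-delimited part-* files from output_dir into a SQL table.

    Assumes each line is one tuple with exactly len(columns) fields.
    All values are stored as TEXT; lines with a different field count
    are skipped.
    """
    col_defs = ", ".join('"%s" TEXT' % c for c in columns)
    conn.execute('CREATE TABLE IF NOT EXISTS "%s" (%s)' % (table, col_defs))
    placeholders = ", ".join("?" for _ in columns)
    insert_sql = 'INSERT INTO "%s" VALUES (%s)' % (table, placeholders)

    rows_loaded = 0
    # Pig writes one part-* file per map or reduce task.
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(path, encoding="utf-8") as f:
            for line in f:
                fields = line.rstrip("\n").split("\t")
                if len(fields) != len(columns):
                    continue  # skip malformed lines
                conn.execute(insert_sql, fields)
                rows_loaded += 1
    conn.commit()
    return rows_loaded
```

You would call it with something like
load_pig_output(sqlite3.connect("out.db"), "/tmp/pig_out", "results",
["name", "total"]) after copying the job output to /tmp/pig_out. For
large outputs a StoreFunc that writes to the database directly avoids
the intermediate copy.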

Also CC'ing the Pig user group.

Daniel

On Tue, Aug 23, 2011 at 3:04 PM, Keren Ouaknine <ke...@gmail.com> wrote:
> Hello,
>
> Pig generates data to HDFS and I would like to find a way to convert it to a
> general format by either:
> 1. flattening the data (would copyToLocal work here?!)
> 2. exporting the data to SQL tables (or any other non-Hadoop-specific format)
> 3. generating K,V pairs of data (since Pig code is converted to an MR job, which
> takes a K,V pair)
>
> Which solution is feasible / preferred? Thanks for your help!
> Keren
>
>
> --
>  Keren Ouaknine
> Cell: +972 54 2565404
> Web: www.kereno.com
>