Posted to user@pig.apache.org by Siddhi Borkar <si...@persistent.co.in> on 2013/09/26 09:59:38 UTC
Loading a custom schema
Hi,
I am trying to load a TSV file using PigStorage:
input_data = load 'input.tsv' using PigStorage('\t','-schema');
This loads the tsv file as per the .pig_schema file present in the input folder.
Is there any way to load the schema from a custom path? For example, say the schema is saved under a different name (not .pig_schema) and in a different directory from the one where the input file is located.
Thanks,
Siddhi
RE: Loading a custom schema
Posted by Siddhi Borkar <si...@persistent.co.in>.
Thanks Prashant, will try this out.
-----Original Message-----
From: Prashant Kommireddi [mailto:prash1784@gmail.com]
Sent: Thursday, September 26, 2013 1:47 PM
To: user@pig.apache.org
Subject: Re: Loading a custom schema
Hi Siddhi,
PigStorage by default looks for ".pig_schema" under the input dir. If you would like to use a different filename, you would have to override PigStorage.getSchema(String location, Job job) and define a custom JsonMetadata object. You might want to start here.
Using a schema file location completely outside of data files would involve passing the appropriate "schema path" location to JsonMetadata.getSchema.
Re: Loading a custom schema
Posted by Prashant Kommireddi <pr...@gmail.com>.
Hi Siddhi,
PigStorage by default looks for ".pig_schema" under the input dir. If you
would like to use a different filename, you would have to override
PigStorage.getSchema(String location, Job job) and define a custom
JsonMetadata object. You might want to start here.
Using a schema file location completely outside of data files would involve
passing the appropriate "schema path" location to JsonMetadata.getSchema.
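For illustration only, here is roughly how such a loader might be invoked from a script once it is written. This is a sketch under assumptions: myloaders.jar, com.example.CustomSchemaStorage, and the /schemas/input_tsv path are hypothetical names and not part of Pig. The class is assumed to extend PigStorage and override getSchema as described above.

```
-- Hypothetical jar holding the PigStorage subclass that overrides getSchema.
REGISTER myloaders.jar;

-- Assumed constructor arguments: the field delimiter, then the schema location
-- the subclass would hand to JsonMetadata.getSchema instead of the input dir.
input_data = LOAD 'input.tsv'
    USING com.example.CustomSchemaStorage('\t', '/schemas/input_tsv');

-- Prints the schema Pig resolved for the relation, to check the override took effect.
DESCRIBE input_data;
```

If DESCRIBE reports the schema as unknown, the loader most likely did not find the schema file at the location it was given.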