Posted to user@pig.apache.org by Eyal Allweil <ey...@yahoo.com.INVALID> on 2016/07/04 14:05:45 UTC
Re: Schema issue while storing multiple pig outputs using CSVExcelStorage
I can replicate these results on Pig 0.14.
Did anyone open a Jira issue for this?
On Thursday, March 10, 2016 12:24 PM, Sarath Sasidharan <ss...@bol.com> wrote:
Hi All,
I have a script which stores 2 relations with different schemas using CSVExcelStorage.
The issue I see is that the script takes the schema of the last store function and applies it to all store functions, overriding the schemas of the earlier stores. Is this a known issue, and is there a fix for it?
My sample script looks like this:
=============================================================
masterInput = load 'hbase://xyz' using org.apache.pig.backend.hadoop.hbase.HBaseStorage(
    'f:a,f:b,f:c,f:d')
    as (a,b,c,d);
input2 = foreach masterInput
    generate
    a,b;
input3 = foreach masterInput
    generate
    c,d;
store input2 into '/dir/ab'
    using org.apache.pig.piggybank.storage.CSVExcelStorage('\t','YES_MULTILINE', 'UNIX', 'WRITE_OUTPUT_HEADER');
store input3 into '/dir/cd'
    using org.apache.pig.piggybank.storage.CSVExcelStorage('\t','YES_MULTILINE', 'UNIX', 'WRITE_OUTPUT_HEADER');
=============================================================
Expected output:
file 1    file 2
a,b       c,d
10,20     30,40
Actual output:
file 1    file 2
c,d       c,d
10,20     30,40
Thanks and Regards,
Sarath Sasidharan
Re: Schema issue while storing multiple pig outputs using CSVExcelStorage
Posted by Rohini Palaniswamy <ro...@gmail.com>.
Can you try in Pig 0.16? Niels fixed this in
https://issues.apache.org/jira/browse/PIG-4689
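For anyone who cannot move to Pig 0.16 right away, one possible stopgap is to give each store its own script, so that no single run contains more than one CSVExcelStorage store. This is only a sketch based on the symptom reported in this thread (the last store's schema ends up in every header); it has not been confirmed here. The file name store_ab.pig is hypothetical, and the piggybank jar is assumed to be registered the same way as for the original script.

```pig
-- store_ab.pig (hypothetical file name): only the a,b store lives in this script
masterInput = load 'hbase://xyz' using org.apache.pig.backend.hadoop.hbase.HBaseStorage(
    'f:a,f:b,f:c,f:d')
    as (a,b,c,d);

input2 = foreach masterInput generate a, b;

-- one store per run, so there is no later store whose header could replace this one
store input2 into '/dir/ab'
    using org.apache.pig.piggybank.storage.CSVExcelStorage('\t', 'YES_MULTILINE', 'UNIX', 'WRITE_OUTPUT_HEADER');
```

A second script with the same load, `generate c, d` and '/dir/cd' would produce the other output. The trade-off is that the HBase table is scanned once per script instead of once overall.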
Re: Schema issue while storing multiple pig outputs using CSVExcelStorage
Posted by Sarath Sasidharan <ss...@bol.com>.
Hi Eyal,
I have created a ticket: PIG-4943 (https://issues.apache.org/jira/browse/PIG-4943)
Thanks and Regards,
Sarath