Posted to user@pig.apache.org by Mix Nin <pi...@gmail.com> on 2013/03/07 01:41:38 UTC
FLATTEN is not working
I have a file with the following data:
xxxxx 11,22,33 44,55,66 77,88,99
I wrote the following Pig script:
X = LOAD '/user/lnindrakrishna/tmp/ExpTag.txt' AS (id:chararray, qc:chararray, qt:chararray, qe:chararray);
Y = FOREACH X GENERATE id, STRSPLIT(qc,',') AS split_qc, STRSPLIT(qt,',') AS split_qt, STRSPLIT(qe,',') AS split_qe;
Z = FOREACH Y GENERATE id, FLATTEN(TOBAG(split_qc));
I expected output as follows:
xxxxx 11
xxxxx 22
xxxxx 33
But the script above produces this output instead:
(xxxxx,11,22,33)
FLATTEN is not actually flattening the bag of tuples. Any suggestions?
- Thanks
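Editor's sketch: FLATTEN applied to a bag emits one output row per tuple in the bag, so a bag that holds a single three-field tuple yields a single four-field row, which matches the output reported above. The snippet below is a rough Python model of that rule, not Pig code; `flatten_bag` and the sample bags are hypothetical names used only for illustration.

```python
def flatten_bag(prefix, bag):
    """Model of FLATTEN on a bag: one output row per tuple in the bag."""
    return [prefix + tuple(t) for t in bag]


# A bag holding a single 3-field tuple (assumed shape of the bag in the script above).
one_tuple_bag = [("11", "22", "33")]
# A bag holding three 1-field tuples (the shape needed for the expected output).
three_tuple_bag = [("11",), ("22",), ("33",)]

rows_a = flatten_bag(("xxxxx",), one_tuple_bag)
rows_b = flatten_bag(("xxxxx",), three_tuple_bag)

print(rows_a)  # one row with four fields
print(rows_b)  # three rows with two fields each
```

To get one row per value, each value has to end up in its own tuple inside the bag before FLATTEN is applied.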
Re: FLATTEN is not working
Posted by Mix Nin <pi...@gmail.com>.
I used the script below and got the desired output. Thanks for the reply.
A = foreach Z generate $0 as id, FLATTEN(TOBAG(*)) as value;
I have another question.
Currently the input looks like this:
xxxxx 11,22,33 44,55,66 77,88,99
Suppose the input is instead:
xxxxx 11,22,33 44,55,66 77,88,99
yyyyy 12,23 34,45 56,67
zzzzz 1,2,3,4 5,6,7,8,9 66,77,88,99
And the output needs to be as follows:
xxxxx 11 44 77
xxxxx 22 55 88
xxxxx 33 66 99
yyyyy 12 34 56
yyyyy 23 45 67
zzzzz 1 5 66
zzzzz 2 6 77
zzzzz 3 7 88
zzzzz 4 8 99
So basically, the input can have a variable number of values in each field. How
can the script be changed to handle this?
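Editor's sketch: one possible way to handle a variable number of values is a small Python (Jython) UDF that splits the three fields and pairs up their i-th values. This is only a sketch under assumptions: the file name `transpose_udf.py`, the function `zip_fields`, and the alias `tx` are hypothetical, and the fallback `outputSchema` definition exists only so the file can also be run outside Pig. `zip` stops at the shortest field, which matches the zzzzz rows in the desired output (the extra value 9 is dropped).

```python
# transpose_udf.py (hypothetical file name)
try:
    outputSchema  # expected to be supplied by Pig for Jython UDFs
except NameError:
    def outputSchema(schema):
        """Stand-in decorator so the file also runs outside Pig."""
        def wrap(func):
            return func
        return wrap


@outputSchema("rows:bag{t:tuple(qc:chararray,qt:chararray,qe:chararray)}")
def zip_fields(qc, qt, qe):
    """Pair up the i-th values of three comma-separated fields."""
    if qc is None or qt is None or qe is None:
        return []  # a null field produces an empty bag
    columns = [field.split(",") for field in (qc, qt, qe)]
    # zip truncates to the shortest column; list() keeps this a list of tuples
    return list(zip(*columns))
```

Assuming the file is registered as a Jython UDF, the Pig side might look like this (untested):
REGISTER 'transpose_udf.py' USING jython AS tx;
Z = FOREACH X GENERATE id, FLATTEN(tx.zip_fields(qc, qt, qe));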
Re: FLATTEN is not working
Posted by Mix Nin <pi...@gmail.com>.
Hi Harsha,
With the new script I am getting the output below. It is not transposed:
(xxxxx,(11,44,77),(22,55,88),(33,66,99))
Also, there is no guarantee that the input will have exactly 3
comma-separated values in each field. The number of values can vary.
Thanks
Re: FLATTEN is not working
Posted by Harsha <ha...@defun.org>.
I can think of doing something along these lines, but there might be a better way.
Z = foreach Y generate id, TOTUPLE(split_qc.$0,split_qt.$0,split_qe.$0),TOTUPLE(split_qc.$1,split_qt.$1,split_qe.$1),TOTUPLE(split_qc.$2,split_qt.$2,split_qe.$2);
A = foreach Z generate $0, flatten(TOBAG($1,$2,$3));
--
Harsha
Re: FLATTEN is not working
Posted by Mix Nin <pi...@gmail.com>.
Harsha, thanks for the reply. Suppose I want the output to look like this:
xxxxx 11 44 77
xxxxx 22 55 88
xxxxx 33 66 99
How would the script be written?
Re: FLATTEN is not working
Posted by Harsha <ha...@defun.org>.
Hi Mix,
You are calling TOBAG on a tuple, which will wrap it as {((11,22,33))}.
Flatten the tuple before calling TOBAG:
Z = foreach Y GENERATE id ,flatten(split_qc);
A = foreach Z generate $0, flatten(TOBAG($1,$2,$3));
--
Harsha