Posted to dev@pig.apache.org by "Arnab Guin (JIRA)" <ji...@apache.org> on 2012/12/21 00:37:13 UTC
[jira] [Commented] (PIG-3060) FLATTEN in nested foreach fails when
the input contains an empty bag
[ https://issues.apache.org/jira/browse/PIG-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537492#comment-13537492 ]
Arnab Guin commented on PIG-3060:
---------------------------------
I am not sure whether this issue was fixed on trunk between the date it was filed and today, so I tried the example on the latest trunk.
Here is my input file (with empty bags):
flatten.txt:
2 {}
3 {}
4 {}
I essentially used the attached program with some minor modifications (adding load, dump, etc.). COUNT(c1) is 0 for every group, as expected.
flatten.pig:
A = load './flatten.txt' using PigStorage(' ') as (a0:int, a1:bag{(t:chararray)});
B = group A by $0;
dump B;
C = foreach B {
    c1 = foreach A generate FLATTEN(a1);
    generate COUNT(c1);
};
dump C;
shell> pig -x local -f flatten.pig
(2,{(2,{})})
(3,{(3,{})})
(4,{(4,{})})
(0)
(0)
(0)
Here is another example, where the bags are non-empty:
flatten.txt:
2 {(a),(b),(c)}
3 {(x),(y),(z)}
shell> pig -x local -f flatten.pig
(2,{(2,{(a),(b),(c)})})
(3,{(3,{(x),(y),(z)})})
(3)
(3)
Did I get something wrong?
-Arnab
> FLATTEN in nested foreach fails when the input contains an empty bag
> --------------------------------------------------------------------
>
> Key: PIG-3060
> URL: https://issues.apache.org/jira/browse/PIG-3060
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.10.0
> Reporter: Youngwook Kim
>
> FLATTEN inside a foreach statement produces wrong results, if the input contains an empty bag.
> {code}
> A = load 'flatten.txt' as (a0:int, a1:bag{(t:chararray)});
> B = group A by a0;
> C = foreach B {
>     c1 = foreach A generate FLATTEN(a1);
>     generate COUNT(c1);
> };
> {code}
> The easy workaround is to filter out empty bags.
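The workaround mentioned in the issue description (filtering out empty bags) might look like the sketch below. It reuses the load statement and schema from the comment above; the alias A_nonempty and the extra "group" column in the output are illustrative choices, not part of the original report. IsEmpty is Pig's builtin test for empty bags and maps.

```pig
-- Sketch only: same input file and schema as in the comment above.
A = load './flatten.txt' using PigStorage(' ') as (a0:int, a1:bag{(t:chararray)});

-- Drop rows whose bag is empty before grouping (A_nonempty is a made-up alias).
A_nonempty = filter A by not IsEmpty(a1);

B = group A_nonempty by a0;
C = foreach B {
    -- Flatten the remaining (non-empty) bags inside each group.
    c1 = foreach A_nonempty generate FLATTEN(a1);
    -- Emit the group key as well, so each count can be matched to its key.
    generate group, COUNT(c1);
};
dump C;
```

One consequence of filtering first: a key whose rows all carry empty bags no longer appears in the output at all, instead of appearing with a count of 0.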