Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2011/05/13 01:06:47 UTC
[jira] [Commented] (PIG-2067) FilterLogicExpressionSimplifier removed some branches in some cases
[ https://issues.apache.org/jira/browse/PIG-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032739#comment-13032739 ]
Daniel Dai commented on PIG-2067:
---------------------------------
Actually, it erroneously removes one branch. The plan below filters only on COUNT(B); the COUNT(A) condition is gone:
{code}
#-----------------------------------------------
# New Logical Plan:
#-----------------------------------------------
E: (Name: LOStore Schema: group#30:bytearray,A#31:bag{#242:tuple(cookie#10:bytearray)},B#33:bag{#243:tuple(cookie#11:bytearray)})
|
|---E: (Name: LOFilter Schema: group#30:bytearray,A#31:bag{#242:tuple(cookie#10:bytearray)},B#33:bag{#243:tuple(cookie#11:bytearray)})
| |
| (Name: GreaterThan Type: boolean Uid: 38)
| |
| |---(Name: UserFunc(org.apache.pig.builtin.COUNT) Type: long Uid: 35)
| | |
| | |---B:(Name: Project Type: bag Uid: 33 Input: 0 Column: 2)
| |
| |---(Name: Cast Type: long Uid: 36)
| |
| |---(Name: Constant Type: int Uid: 36)
|
|---C: (Name: LOCogroup Schema: group#30:bytearray,A#31:bag{#242:tuple(cookie#10:bytearray)},B#33:bag{#243:tuple(cookie#11:bytearray)})
| |
| cookie:(Name: Project Type: bytearray Uid: 10 Input: 0 Column: 0)
| |
| cookie:(Name: Project Type: bytearray Uid: 11 Input: 1 Column: 0)
|
|---A: (Name: LOLoad Schema: cookie#10:bytearray)RequiredFields:null
|
|---B: (Name: LOLoad Schema: cookie#11:bytearray)RequiredFields:null
{code}
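To illustrate the effect of the dropped branch, here is a rough Python sketch that mimics the cogroup and both versions of the filter on the sample data from the issue description. It is not Pig code: the cogroup helper and the use of plain integers for the cookie values are assumptions made for illustration, and the threshold 0 is taken from the script because the plan dump does not print the constant's value.

```python
from collections import defaultdict

def cogroup(a, b):
    """Hypothetical stand-in for `cogroup A by cookie, B by cookie`."""
    groups = defaultdict(lambda: ([], []))
    for cookie in a:
        groups[cookie][0].append(cookie)
    for cookie in b:
        groups[cookie][1].append(cookie)
    # One row per key: (group, bag of A, bag of B)
    return [(key, bag_a, bag_b) for key, (bag_a, bag_b) in sorted(groups.items())]

# Sample data from the issue description (integers assumed for simplicity)
a = [1, 2, 3, 4, 5, 6, 7]
b = [3, 4, 5, 6, 7, 8]
c = cogroup(a, b)

# Filter as written in the script: COUNT(B)>0 AND COUNT(A)>0
expected = [row for row in c if len(row[2]) > 0 and len(row[1]) > 0]

# Filter as it appears in the simplified plan: only COUNT(B)>0 remains
actual = [row for row in c if len(row[2]) > 0]

# Rows that pass only because the COUNT(A) branch was removed
extra = [row for row in actual if row not in expected]
print(extra)
```

The only extra row is the one for key 8, whose A bag is empty. That matches the unexpected (8,{},{(8)}) tuple reported in the issue.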
> FilterLogicExpressionSimplifier removed some branches in some cases
> -------------------------------------------------------------------
>
> Key: PIG-2067
> URL: https://issues.apache.org/jira/browse/PIG-2067
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.1, 0.9.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.8.1, 0.9.0
>
>
> The following script produces the wrong result:
> {code}
> A = load 'a.dat' as (cookie);
> B = load 'b.dat' as (cookie);
> C = cogroup A by cookie, B by cookie;
> E = filter C by COUNT(B)>0 AND COUNT(A)>0;
> explain E;
> {code}
> a.dat:
> 1 1
> 2 2
> 3 3
> 4 4
> 5 5
> 6 6
> 7 7
> b.dat:
> 3 3
> 4 4
> 5 5
> 6 6
> 7 7
> 8 8
> Expected output:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> We get:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> (8,{},{(8)})