Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2011/05/13 00:33:47 UTC

[jira] [Commented] (PIG-2067) FilterLogicExpressionSimplifier mess up uid in some cases

    [ https://issues.apache.org/jira/browse/PIG-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032722#comment-13032722 ] 

Daniel Dai commented on PIG-2067:
---------------------------------

The logical plan after LogicExpressionSimplifier is wrong:

E: (Name: LOStore Schema: group#29:bytearray,A#30:bag{#240:tuple(cookie#10:bytearray)},B#32:bag{#241:tuple(cookie#11:bytearray)})
|
|---E: (Name: LOFilter Schema: group#29:bytearray,A#30:bag{#240:tuple(cookie#10:bytearray)},B#32:bag{#241:tuple(cookie#11:bytearray)})
    |   |
    |   (Name: And Type: boolean Uid: 41)
    |   |
    |   |---(Name: GreaterThan Type: boolean Uid: 37)
    |   |   |
    |   |   |---(Name: UserFunc(org.apache.pig.builtin.COUNT) Type: long Uid: 34)
    |   |   |   |
    |   |   |   |---B:(Name: Project Type: bag Uid: 32 Input: 0 Column: 2)
    |   |   |
    |   |   |---(Name: Cast Type: long Uid: 35)
    |   |       |
    |   |       |---(Name: Constant Type: int Uid: 35)
    |   |
    |   |---(Name: GreaterThan Type: boolean Uid: 40)
    |       |
    |       |---(Name: Cast Type: int Uid: 29)
    |       |   |
    |       |   |---group:(Name: Project Type: bytearray Uid: 29 Input: 0 Column: 0)
    |       |
    |       |---(Name: Constant Type: int Uid: 39)
    |
    |---C: (Name: LOCogroup Schema: group#29:bytearray,A#30:bag{#240:tuple(cookie#10:bytearray)},B#32:bag{#241:tuple(cookie#11:bytearray)})
        |   |
        |   cookie:(Name: Project Type: bytearray Uid: 10 Input: 0 Column: 0)
        |   |
        |   cookie:(Name: Project Type: bytearray Uid: 11 Input: 1 Column: 0)
        |
        |---A: (Name: LOLoad Schema: cookie#10:bytearray)RequiredFields:null
        |
        |---B: (Name: LOLoad Schema: cookie#11:bytearray)RequiredFields:null

In the second GreaterThan, the left operand projects group (uid 29, column 0) rather than A (uid 30).
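
To make the effect of that wrong projection concrete, here is a small Python sketch that mimics the cogroup and the filter outside of Pig. It uses the cookie values from a.dat and b.dat in the issue description quoted below. The helper name cogroup and the variable names are hypothetical, and the "intended" predicate assumes the second condition was meant to test COUNT(A) > 0; that reading is an assumption based on the comment above, not something the thread states outright.

```python
from collections import defaultdict

# Cookie values (first column) of a.dat and b.dat from the issue description.
A = [1, 2, 3, 4, 5, 6, 7]
B = [3, 4, 5, 6, 7, 8]


def cogroup(a, b):
    """Toy stand-in for `cogroup A by cookie, B by cookie`.

    Returns rows shaped like (group, bag_of_A, bag_of_B), sorted by group.
    """
    bags = defaultdict(lambda: ([], []))
    for value in a:
        bags[value][0].append(value)
    for value in b:
        bags[value][1].append(value)
    return [(key, bags[key][0], bags[key][1]) for key in sorted(bags)]


C = cogroup(A, B)

# Assumed intent: both bags must be non-empty (COUNT(B) > 0 AND COUNT(A) > 0).
intended = [row for row in C if len(row[2]) > 0 and len(row[1]) > 0]

# What the plan above evaluates: the second comparison reads group (column 0).
as_planned = [row for row in C if len(row[2]) > 0 and row[0] > 0]

print([row[0] for row in intended])
print([row[0] for row in as_planned])
```

Under these assumptions the first list stops at group 3 through 7, while the second also lets group 8 through, because 8 > 0 holds even though the A bag for that group is empty.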

> FilterLogicExpressionSimplifier mess up uid in some cases
> ---------------------------------------------------------
>
>                 Key: PIG-2067
>                 URL: https://issues.apache.org/jira/browse/PIG-2067
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.1, 0.9.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.8.1, 0.9.0
>
>
> The following script produces the wrong result:
> {code}
> A = load 'a.dat' as (cookie);
> B = load 'b.dat' as (cookie);
> C = cogroup A by cookie, B by cookie;
> E = filter C by COUNT(B)>0 AND group>0;
> explain E;
> {code}
> a.dat:
> 1       1
> 2       2
> 3       3
> 4       4
> 5       5
> 6       6
> 7       7
> b.dat:
> 3       3
> 4       4
> 5       5
> 6       6
> 7       7
> 8       8
> Expected output:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> We get:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> (8,{},{(8)})
