Posted to dev@pig.apache.org by "Cheolsoo Park (JIRA)" <ji...@apache.org> on 2013/12/23 22:17:50 UTC

[jira] [Commented] (PIG-3581) Incorrect scope resolution with nested foreach

    [ https://issues.apache.org/jira/browse/PIG-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855935#comment-13855935 ] 

Cheolsoo Park commented on PIG-3581:
------------------------------------

[~aniket486], I think this patch introduced a regression. Consider the following query:
{code}
a = LOAD 'foo' AS (x:int, y:chararray);
b = GROUP a BY x;
c = FOREACH b {
    expr = 'bar';
    filtered = FILTER a BY y == expr;
    GENERATE COUNT(filtered);
}
DESCRIBE c;
{code}
This used to work in 0.11 but no longer works in trunk. It looks like 'expr' used to be resolved to a scalar expression ('bar'), but that is no longer the case.
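
For anyone hitting this in the meantime, a possible workaround is to inline the literal in the FILTER so that no nested scalar alias has to be resolved. This is only a sketch derived from the query above; it has not been verified against trunk:
{code}
a = LOAD 'foo' AS (x:int, y:chararray);
b = GROUP a BY x;
c = FOREACH b {
    -- the literal 'bar' is written inline instead of going through the nested alias 'expr'
    filtered = FILTER a BY y == 'bar';
    GENERATE COUNT(filtered);
}
DESCRIBE c;
{code}
This only sidesteps the alias lookup; it does not answer whether a nested scalar definition such as expr = 'bar'; is meant to be supported.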

My questions are:
1. Is defining a local scalar expression inside a nested foreach supported? e.g. expr = 'bar';
2. If so, can you fix the regression?

> Incorrect scope resolution with nested foreach
> ----------------------------------------------
>
>                 Key: PIG-3581
>                 URL: https://issues.apache.org/jira/browse/PIG-3581
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Venu Satuluri
>            Assignee: Aniket Mokashi
>         Attachments: PIG-3581-1.patch, PIG-3581-2.patch
>
>
> Consider the following script:
> {code}
> A = LOAD 'test_data' AS (a: int, b: int);
> C = FOREACH A GENERATE *;
> B = FOREACH (GROUP A BY a) {
> 	C = FILTER A BY b % 2 == 0;
> 	D = FILTER A BY b % 2 == 1;
> 	GENERATE group AS a, A.b AS every, C.b AS even, D.b AS odd;
> };
> DESCRIBE B;
> {code}
> Notice that C is defined both inside the nested foreach as well as outside. I would expect that in the GENERATE inside the nested FOREACH, the C that is used will be the one that is defined inside. If that is not so, I think at least a warning is due.
> However, currently Pig silently assumes that the C you mean is the one that is defined *outside* the nested FOREACH.
> Hence, the result of "DESCRIBE B" looks as follows:
> {code}
> B: {
>     a: int,
>     every: {
>         (
>             b: int
>         )
>     },
>     even: int,
>     odd: {
>         (
>             b: int
>         )
>     }
> }
> {code}
> If I remove the definition of C that is outside the foreach, then I get the following for "DESCRIBE B":
> {code}
> B: {
>     a: int,
>     every: {
>         (
>             b: int
>         )
>     },
>     even: {
>         (
>             b: int
>         )
>     },
>     odd: {
>         (
>             b: int
>         )
>     }
> }
> {code}
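> Until the scoping is fixed, the collision can presumably be avoided by giving the nested relation a name that is not used outside the FOREACH. The sketch below renames the inner C to the hypothetical alias C_even (and D to D_odd for symmetry); these names are illustrative only and this variant has not been run:
> {code}
> A = LOAD 'test_data' AS (a: int, b: int);
> C = FOREACH A GENERATE *;
> B = FOREACH (GROUP A BY a) {
> 	-- C_even and D_odd exist only inside this block, so they cannot be confused with the outer C
> 	C_even = FILTER A BY b % 2 == 0;
> 	D_odd = FILTER A BY b % 2 == 1;
> 	GENERATE group AS a, A.b AS every, C_even.b AS even, D_odd.b AS odd;
> };
> DESCRIBE B;
> {code}
> With no name shared between the two scopes, "even" would be expected to come out as a bag, as in the second DESCRIBE output above.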


