Posted to dev@pig.apache.org by "Shravan Matthur Narayanamurthy (JIRA)" <ji...@apache.org> on 2008/09/17 22:25:46 UTC

[jira] Updated: (PIG-430) Projections in nested filter and inside foreach do not work

     [ https://issues.apache.org/jira/browse/PIG-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shravan Matthur Narayanamurthy updated PIG-430:
-----------------------------------------------

    Status: Patch Available  (was: Open)

I have fixed the part of the problem that concerns projections. The issue with nested distinct still remains. The problem is that projects are being introduced into the input of distinct, which creates a unique case where projection chaining does not work. It is similar to the case where you assign a nested project to a variable inside a nested block, which was solved by replacing the nested project with a foreach statement. The solution to the distinct problem should be similar: the input to the distinct should also be allowed to be a nested project.

I made a local change for this, replacing BaseEvalSpec with NestedProject in my code, and it works. However, I don't want to break anything, because I am not fully aware of the side effects of changing this in the parser. It's better if someone more comfortable with the parser takes a look at this one.
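
For reference, the nested-project assignment I am comparing this to looks roughly like the following. This is only an illustrative sketch: the alias P, the file name, and the column positions are placeholders and are not taken from the failing queries or from the patch.

-- hypothetical example: assign a nested project to an alias inside a nested block
A = load 'file';
B = group A by $0;
C = foreach B { P = A.$1; generate group, COUNT(P); }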

Also, I think there are some issues with the parsing of nested blocks. I tried the following, and the parser never terminates the nested block; it just keeps waiting for more input:

A = load 'file';
B = group A by $0;
C = foreach B { C1=distinct "const"; generate C1;}

I could not work out why this happens, but I tried it because I thought that the input to a nested distinct shouldn't be a BaseEvalSpec, which can also be a FuncEvalSpec or a Constant. I think we need to change things a bit here.

> Projections in nested filter and inside foreach do not work
> -----------------------------------------------------------
>
>                 Key: PIG-430
>                 URL: https://issues.apache.org/jira/browse/PIG-430
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>            Assignee: Shravan Matthur Narayanamurthy
>             Fix For: types_branch
>
>         Attachments: 430-1.patch
>
>
> The following queries do not work:
> Nested filter:
> a = load 'studenttab10k' as (name, age, gpa);
> b = filter a by age < 20;
> c = group b by age;
> d = foreach c { cf = filter b by gpa < 3.0; cp = cf.gpa; cd = distinct cp; co = order cd by $0; generate group, flatten(co); }
> store d into 'output';
> Nested Distinct:
> a = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
> b = group a by name;
> c = foreach b { aa = distinct a.age; generate group, COUNT(aa); }
> store c into 'output';

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.