Posted to dev@pig.apache.org by "Vivek Padmanabhan (Commented) (JIRA)" <ji...@apache.org> on 2011/09/30 14:29:45 UTC

[jira] [Commented] (PIG-2119) DuplicateForEachColumnRewrite makes assumptions about the position of LOGGenerate in the plan

    [ https://issues.apache.org/jira/browse/PIG-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118016#comment-13118016 ] 

Vivek Padmanabhan commented on PIG-2119:
----------------------------------------

I hit this issue with the script below:
{code}
A = load '3char_1long_tab' as (f1:chararray, f2:chararray, f3:chararray, ct:long);
B = GROUP A BY f1;
C = FOREACH B {
        zip_ordered = ORDER A BY f3 ASC;
        GENERATE
                FLATTEN(group) AS f1,
                A.(f3, ct),
                --COUNT(zip_ordered),
                SUM(A.ct) AS total;
};

dump C;
{code}

The zip_ordered alias was left in by accident and is never used. Pig 0.8 silently ignores it, while Pig 0.9 throws an exception.
I believe the affected version should be 0.9.
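
To illustrate the diagnosis in the issue description quoted below (the rule seems to assume the nested block has a single sink and that it is an LOGenerate), here is a minimal, self-contained Java sketch. The classes Operator, Generate, Sort and Limit, and both helper methods, are hypothetical stand-ins and not Pig's real classes or API. The order of the sinks in the list is also an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedPlanSinks {
    // Hypothetical stand-ins for logical operators; NOT Pig's real classes.
    static class Operator {
        final String name;
        Operator(String name) { this.name = name; }
    }
    static class Generate extends Operator { Generate() { super("generate"); } }
    static class Sort extends Operator { Sort() { super("sort"); } }
    static class Limit extends Operator { Limit() { super("limit"); } }

    // Fragile lookup: blindly casts the first sink to Generate.
    // Throws ClassCastException if an unused alias ends up as the first sink.
    static Generate firstSinkAsGenerate(List<Operator> sinks) {
        return (Generate) sinks.get(0);
    }

    // Defensive lookup: scans every sink and returns the first Generate,
    // or null if none is present.
    static Generate findGenerate(List<Operator> sinks) {
        for (Operator op : sinks) {
            if (op instanceof Generate) {
                return (Generate) op;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // An unused nested alias (like zip_ordered) leaves a second sink.
        List<Operator> sinks = new ArrayList<>();
        sinks.add(new Sort());
        sinks.add(new Generate());

        try {
            firstSinkAsGenerate(sinks);
        } catch (ClassCastException e) {
            System.out.println("fragile lookup failed with ClassCastException");
        }
        System.out.println("defensive lookup found: " + findGenerate(sinks).name);
    }
}
```

Whether the actual fix should search the sinks like this or prune unused nested aliases first is a separate design decision; the sketch only shows why the unconditional cast breaks once a second sink exists.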

                
> DuplicateForEachColumnRewrite makes assumptions about the position of LOGGenerate in the plan
> ---------------------------------------------------------------------------------------------
>
>                 Key: PIG-2119
>                 URL: https://issues.apache.org/jira/browse/PIG-2119
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Gianmarco De Francisci Morales
>
> The input:
> {code}
> grunt> cat b.txt
> a       11
> b       3
> c       10
> a       12
> b       10
> c       15
> {code}
> The script:
> {code}
> a = load 'b.txt' AS (id:chararray, num:int);
> b = group a by id;
> c = foreach b { 
>   d = order a by num DESC;
>   n = COUNT(a);
>   e = limit d 1;
>   generate n;
> }
> {code}
> The exception:
> {code}
> Caused by: java.lang.ClassCastException: org.apache.pig.newplan.logical.relational.LOLimit cannot be cast to org.apache.pig.newplan.logical.relational.LOGenerate
>         at org.apache.pig.newplan.logical.rules.DuplicateForEachColumnRewrite$DuplicateForEachColumnRewriteTransformer.check(DuplicateForEachColumnRewrite.java:87)
>         at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:108)
> {code}
> I know the script is a bit pointless, but I was just testing and modifying the script bit by bit.
> If I remove the limit, I still get the same exception, but with LOSort.
> The problem, I think, is that the rule assumes there is only 1 sink in the nested block and that this sink is a LOGenerate.
