Posted to dev@pig.apache.org by "Vivek Padmanabhan (Commented) (JIRA)" <ji...@apache.org> on 2011/09/30 14:29:45 UTC
[jira] [Commented] (PIG-2119) DuplicateForEachColumnRewrite makes
assumptions about the position of LOGGenerate in the plan
[ https://issues.apache.org/jira/browse/PIG-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118016#comment-13118016 ]
Vivek Padmanabhan commented on PIG-2119:
----------------------------------------
I ran into this issue with the script below:
{code}
A = load '3char_1long_tab' as (f1:chararray, f2:chararray, f3:chararray,ct:long);
B = GROUP A BY f1;
C = FOREACH B {
    zip_ordered = ORDER A BY f3 ASC;
    GENERATE
        FLATTEN(group) AS f1,
        A.(f3, ct),
        --COUNT(zip_ordered),
        SUM(A.ct) AS total;
};
dump C;
{code}
The zip_ordered alias was left in by accident and is not used. Pig 0.8 silently ignores it, while Pig 0.9 throws an exception.
I believe the affected version should be 0.9.
> DuplicateForEachColumnRewrite makes assumptions about the position of LOGGenerate in the plan
> ---------------------------------------------------------------------------------------------
>
> Key: PIG-2119
> URL: https://issues.apache.org/jira/browse/PIG-2119
> Project: Pig
> Issue Type: Bug
> Reporter: Gianmarco De Francisci Morales
>
> The input:
> {code}
> grunt> cat b.txt
> a 11
> b 3
> c 10
> a 12
> b 10
> c 15
> {code}
> The script:
> {code}
> a = load 'b.txt' AS (id:chararray, num:int);
> b = group a by id;
> c = foreach b {
>     d = order a by num DESC;
>     n = COUNT(a);
>     e = limit d 1;
>     generate n;
> }
> {code}
> The exception:
> {code}
> Caused by: java.lang.ClassCastException: org.apache.pig.newplan.logical.relational.LOLimit cannot be cast to org.apache.pig.newplan.logical.relational.LOGenerate
> at org.apache.pig.newplan.logical.rules.DuplicateForEachColumnRewrite$DuplicateForEachColumnRewriteTransformer.check(DuplicateForEachColumnRewrite.java:87)
> at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:108)
> {code}
> I know the script is a bit pointless, but I was just testing and modifying the script bit by bit.
> If I remove the limit, I still get the same exception, but with LOSort instead of LOLimit.
> The problem, I think, is that the rule assumes there is only one sink in the nested block and that this sink is an LOGenerate.
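
The suspected cause described above can be illustrated with a minimal, self-contained Java sketch. The classes Operator, Sort, Limit and Generate are hypothetical stand-ins and not Pig's real logical plan classes; the order of the sinks is also an assumption. The defensive lookup shows one possible way to avoid the blind cast, not necessarily the fix applied in Pig.

```java
import java.util.Arrays;
import java.util.List;

public class NestedPlanSinks {
    // Hypothetical stand-ins for nested-plan operators (NOT Pig's actual classes).
    static class Operator {
        final String alias;
        Operator(String alias) { this.alias = alias; }
    }
    static class Sort extends Operator { Sort(String alias) { super(alias); } }
    static class Limit extends Operator { Limit(String alias) { super(alias); } }
    static class Generate extends Operator { Generate() { super("generate"); } }

    // Fragile: assumes the first sink of the nested plan is the generate operator.
    // An unused alias (such as "e = limit d 1") adds another sink and breaks the cast.
    static Generate firstSinkAsGenerate(List<Operator> sinks) {
        return (Generate) sinks.get(0);
    }

    // Defensive: scan all sinks and return the generate operator, or null if absent.
    static Generate findGenerate(List<Operator> sinks) {
        for (Operator op : sinks) {
            if (op instanceof Generate) {
                return (Generate) op;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed sinks for the script in the report: the unused limit and the generate.
        List<Operator> sinks = Arrays.<Operator>asList(new Limit("e"), new Generate());

        try {
            firstSinkAsGenerate(sinks);
        } catch (ClassCastException ex) {
            System.out.println("Blind cast failed: " + ex.getMessage());
        }

        Generate gen = findGenerate(sinks);
        System.out.println("Scan found generate: " + (gen != null));
    }
}
```

With an unused nested alias, the blind cast fails in the same way as the reported ClassCastException, while the scan still locates the generate operator regardless of how many other sinks exist.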
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira