Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2011/01/26 22:32:47 UTC

[jira] Commented: (PIG-1776) changing statement corresponding to alias after explain , then doing dump gives incorrect result

    [ https://issues.apache.org/jira/browse/PIG-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987233#action_12987233 ] 

Thejas M Nair commented on PIG-1776:
------------------------------------

(Adding more explanation to my previous comment.)
The root cause of the problem was that the UDFContext was not reset between plan regenerations.
The explain command in the query set the requiredfields property of the load function so that only the first two fields were required. When the plan was regenerated for the dump command, the optimizer rules determined that all columns in the load statement were required, so they did not set the requiredfields property, and the value left over from explain stayed in effect. As a result, the load projected only the first two columns and the third column was null, which is why COUNT(a.alph) returns 0 in the final dump.
To fix this bug, the UDFContext is now reset each time a clone of the logical plan is created for regenerating the plan.
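
The failure mode can be illustrated with a small standalone sketch. It is not Pig code and does not use the Pig API: the class StaleContextSketch, the shared map, and the methods generatePlan, load and explainThenDump are all hypothetical names chosen for illustration. It only models the idea that a property written during one plan generation survives into the next unless the shared context is cleared first.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaleContextSketch {
    // Hypothetical stand-in for a shared context that outlives a single plan.
    static final Map<String, String> context = new HashMap<>();

    // Models one plan generation. The "optimizer" records a projection only
    // when it prunes columns; when every column is needed it writes nothing.
    static void generatePlan(int requiredColumns, int totalColumns, boolean resetFirst) {
        if (resetFirst) {
            context.clear(); // the fix: start each regeneration from a clean context
        }
        if (requiredColumns < totalColumns) {
            context.put("requiredfields", String.valueOf(requiredColumns));
        }
    }

    // Models the loader: columns beyond the recorded projection come back null.
    static List<String> load(String[] row) {
        int keep = context.containsKey("requiredfields")
                ? Integer.parseInt(context.get("requiredfields"))
                : row.length;
        List<String> out = new ArrayList<>();
        for (int i = 0; i < row.length; i++) {
            out.add(i < keep ? row[i] : null);
        }
        return out;
    }

    // Replays the sequence from the report: explain needs 2 of 3 columns,
    // then dump needs all 3.
    static List<String> explainThenDump(boolean resetFirst) {
        context.clear();
        generatePlan(2, 3, resetFirst); // explain c
        generatePlan(3, 3, resetFirst); // dump c
        return load(new String[] {"ABC", "1", "a"});
    }

    public static void main(String[] args) {
        System.out.println(explainThenDump(false)); // [ABC, 1, null]
        System.out.println(explainThenDump(true));  // [ABC, 1, a]
    }
}
```

Without the reset, the projection recorded for explain is still present when the loader runs for dump, so the third column is null. With the reset, the second plan generation starts clean and all three columns are loaded.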

> changing statement corresponding to alias after explain , then doing dump gives incorrect result
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1776
>                 URL: https://issues.apache.org/jira/browse/PIG-1776
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>         Attachments: PIG-1776.1.patch
>
>
> {code}
> grunt> a = load '/tmp/t2.txt' as (str:chararray, num1:int, alph : chararray);
> grunt> dump a;
> (ABC,1,a)
> (ABC,1,b)
> (ABC,1,a)
> (ABC,2,b)
> (DEF,1,d)
> (XYZ,1,x)
> grunt> c = foreach b  generate group.str, group.$1, COUNT(a.alph) ;          
> grunt> dump c; -- gives correct results
> (ABC,1,3)
> (ABC,2,1)
> (DEF,1,1)
> (XYZ,1,1)
> /* but dumping c after following steps gives incorrect results */
> grunt> c = foreach b  generate group.$0 , (CHARARRAY)group.$1;                                                                                 
> grunt> explain c;
> ...
> ...
> grunt> c = foreach b  generate group.str, group.$1, COUNT(a.alph) ;
> grunt> dump c;             
> (ABC,1,0)
> (ABC,2,0)
> (DEF,1,0)
> (XYZ,1,0)
> {code}
