Posted to dev@pig.apache.org by "Vivek Padmanabhan (Commented) (JIRA)" <ji...@apache.org> on 2011/11/23 09:27:40 UTC

[jira] [Commented] (PIG-2385) Store statements not getting processed

    [ https://issues.apache.org/jira/browse/PIG-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155748#comment-13155748 ] 

Vivek Padmanabhan commented on PIG-2385:
----------------------------------------

I think this issue was introduced by the new parser changes in Pig 0.9.
From my analysis, the scalar reference "Z1.count" introduces an additional LOStore with a temporary location.
While processing STORE D, Pig also counts this extra LOStore and concludes that all store statements have already been processed.
Hence, the store operator for the alias 'D' is skipped (this happens in PigServer.Graph.skipStores()).
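
To make the miscount concrete, here is a minimal toy model in plain Java. It is not Pig's actual code: the class SkipStoresSketch, the method storesToRun and the use of output paths as stand-ins for store operators are all hypothetical, and the only assumption taken from the analysis above is that stores are skipped by count and that the implicit scalar store inflates that count by one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of count-based store skipping; NOT Pig's actual implementation. */
public class SkipStoresSketch {

    /**
     * Returns the stores that would still run after dropping the first
     * processedStores entries, mimicking a counter-based skip.
     * (Hypothetical helper, named for illustration only.)
     */
    public static List<String> storesToRun(List<String> storesInPlan, int processedStores) {
        List<String> remaining = new ArrayList<String>();
        int skipCount = processedStores;
        for (String store : storesInPlan) {
            if (skipCount > 0) {
                skipCount--;          // assumed to have run already, so drop it
            } else {
                remaining.add(store); // still needs to be executed
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        // The two user-visible stores from the script in the issue description.
        List<String> userStores = Arrays.asList("output/C_out", "output/F_out");

        // Only STORE C has really run, so the counter should be 1.
        System.out.println(storesToRun(userStores, 1)); // prints [output/F_out]

        // Assumption: the implicit scalar store was counted too, so the counter is 2.
        System.out.println(storesToRun(userStores, 2)); // prints []
    }
}
```

With the inflated counter nothing is left to run, which matches the observed behaviour of STORE D being silently dropped.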

The above script can be used to reproduce the scenario without multiquery if the queries are registered and executed through PigServer.
                
> Store statements not getting processed
> --------------------------------------
>
>                 Key: PIG-2385
>                 URL: https://issues.apache.org/jira/browse/PIG-2385
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>            Reporter: Vivek Padmanabhan
>            Priority: Critical
>
> The actual script in which we hit this issue is large and complex. It has 4 STORE statements in total, and one of them is not executed.
> The script executes 3 sets of jobs (excluding the one STORE that is not executed), consisting of 10, 11 and 19 jobs.
> The script below illustrates the issue, but only with multiquery turned off:
> {code}
> A = LOAD 'input1' as (f1:chararray,f2:chararray,f3:chararray);
> Z = group A all;
> Z1 = foreach Z generate COUNT(A) as count;
> B = foreach A generate f1,f2,f3,(100-Z1.count) as diff;
> C = order B by diff;
> STORE C INTO 'output/C_out';
> D = DISTINCT C ;
> store D into 'output/F_out';
> {code}
> For this script, when run with multiquery turned off, the STORE command for D is not executed.
> I can see that the statements are parsed and an LOStore is created for D, but it is still not executed.
> The above script works fine with Pig 0.8. (This issue exists in trunk as well.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira