Posted to dev@pig.apache.org by "Viraj Bhat (JIRA)" <ji...@apache.org> on 2009/11/05 03:01:32 UTC
[jira] Commented: (PIG-1060) MultiQuery optimization throws error for multi-level splits
[ https://issues.apache.org/jira/browse/PIG-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773744#action_12773744 ]
Viraj Bhat commented on PIG-1060:
---------------------------------
Hi Ankur and Richard,
I have a script that demonstrates a similar problem, which can be worked around by running Pig with the -M option. The script reproduces the problem even without the UNION operator, but it still has properties 1 and 2 of the original problem description.
If you comment out the F alias (and its store), the script works fine.
{code}
ORGINALDATA = load '/user/viraj/somedata.txt' using PigStorage() as (col1, col2, col3, col4, col5, col6, col7, col8);
--Check data
A = foreach ORGINALDATA generate col1, col2, col3, col4, col5, col6;
B = group A all;
C = foreach B generate COUNT(A);
store C into '/user/viraj/result1';
D = filter A by (col1 == col2) or (col1 == col3);
E = group D all;
F = foreach E generate COUNT(D);
--try commenting F
store F into '/user/viraj/result2';
G = filter D by (col4 == col5) ;
H = group G all;
I = foreach H generate COUNT(G);
store I into '/user/viraj/result3';
J = filter G by (((col6 == 'm') or (col6 == 'M')) and (col6 == 1)) or (((col6 == 'f') or (col6 == 'F')) and (col6 == 0)) or ((col6 == '') and (col6 == -1));
K = group J all;
L = foreach K generate COUNT(J);
store L into '/user/viraj/result4';
{code}
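For reference, the -M workaround can be sketched as the command line below. The file name multi_split.pig is hypothetical and stands for wherever the script above is saved; -M is the short form of -no_multiquery, which turns multiquery optimization off so the stores are not merged into one shared plan.
{code}
# Assumption: the script above is saved as multi_split.pig (hypothetical name)
pig -M multi_split.pig
{code}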
> MultiQuery optimization throws error for multi-level splits
> -----------------------------------------------------------
>
> Key: PIG-1060
> URL: https://issues.apache.org/jira/browse/PIG-1060
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Ankur
> Assignee: Richard Ding
>
> Consider the following scenario :-
> 1. Multi-level splits in the map plan.
> 2. Each split branch further progressing across a local-global rearrange.
> 3. Output of each of these finally merged via a UNION.
> MultiQuery optimizer throws the following error in such a case:
> "ERROR 2146: Internal Error. Inconsistency in key index found during optimization."