Posted to dev@pig.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2009/12/01 07:58:20 UTC
[jira] Commented: (PIG-978) ERROR 2100
(hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) and
ERROR 2999: (Unexpected internal error. null) when using Multi-Query
optimization
[ https://issues.apache.org/jira/browse/PIG-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784087#action_12784087 ]
Hadoop QA commented on PIG-978:
-------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426454/pig-latin-users-guide.patch
against trunk revision 885465.
+1 @author. The patch does not contain any @author tags.
+0 tests included. The patch appears to be a documentation patch that doesn't require tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/69/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/69/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/69/console
This message is automatically generated.
> ERROR 2100 (hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) and ERROR 2999: (Unexpected internal error. null) when using Multi-Query optimization
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: PIG-978
> URL: https://issues.apache.org/jira/browse/PIG-978
> Project: Pig
> Issue Type: Bug
> Components: documentation
> Affects Versions: 0.6.0
> Reporter: Viraj Bhat
> Assignee: Corinne Chandel
> Fix For: 0.6.0
>
> Attachments: pig-latin-users-guide.patch
>
>
> I have a Pig script of the following form, which I execute using multi-query optimization.
> {code}
> A = load '/user/viraj/firstinput' using PigStorage();
> B = group ....
> C = ..aggregation function
> store C into '/user/viraj/firstinputtempresult/days1';
> ..
> Atab = load '/user/viraj/secondinput' using PigStorage();
> Btab = group ....
> Ctab = ..aggregation function
> store Ctab into '/user/viraj/secondinputtempresult/days1';
> ..
> E = load '/user/viraj/firstinputtempresult/' using PigStorage();
> F = group
> G = aggregation function
> store G into '/user/viraj/finalresult1';
> Etab = load '/user/viraj/secondinputtempresult/' using PigStorage();
> Ftab = group
> Gtab = aggregation function
> store Gtab into '/user/viraj/finalresult2';
> {code}
> 2009-07-20 22:05:44,507 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2100: hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist. Details at logfile: /homes/viraj/pigscripts/pig_1248127173601.log
> This error is due to a mismatch between the store and load commands. The script first stores files into the 'days1' directory (store C into '/user/viraj/firstinputtempresult/days1';), but it later loads from the top-level directory (E = load '/user/viraj/firstinputtempresult/' using PigStorage();) instead of the directory it stored to (/user/viraj/firstinputtempresult/days1).
> The current multi-query optimizer cannot resolve the dependency between these two commands because their file paths differ. As a result, the jobs run concurrently and produce the errors.
> The solution is to add an 'exec' or 'run' command after the first two stores. This forces the first two store commands to run before the remaining commands.
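> As a rough sketch of that workaround, the script could be split into two stages with a bare 'exec' between them. The grouping key ($0) and the COUNT/SUM aggregations below are placeholders chosen only for illustration; they are not taken from the original script.
> {code}
> -- Stage 1: write the intermediate results.
> -- Grouping on $0 and using COUNT are assumptions for illustration.
> A = load '/user/viraj/firstinput' using PigStorage();
> B = group A by $0;
> C = foreach B generate group, COUNT(A);
> store C into '/user/viraj/firstinputtempresult/days1';
> Atab = load '/user/viraj/secondinput' using PigStorage();
> Btab = group Atab by $0;
> Ctab = foreach Btab generate group, COUNT(Atab);
> store Ctab into '/user/viraj/secondinputtempresult/days1';
> -- Run everything above before the statements below are executed.
> exec;
> -- Stage 2: read the intermediate results written in stage 1.
> -- SUM over $1 is likewise an assumption for illustration.
> E = load '/user/viraj/firstinputtempresult/' using PigStorage();
> F = group E by $0;
> G = foreach F generate group, SUM(E.$1);
> store G into '/user/viraj/finalresult1';
> Etab = load '/user/viraj/secondinputtempresult/' using PigStorage();
> Ftab = group Etab by $0;
> Gtab = foreach Ftab generate group, SUM(Etab.$1);
> store Gtab into '/user/viraj/finalresult2';
> {code}
> With the 'exec' in place, the two stage 1 stores should complete before the stage 2 loads are planned, so the jobs no longer run concurrently.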
> It would be nice to see this fixed as part of an enhancement to multi-query. We could either disable multi-query in this case or emit a warning/error message so that the user can correct the load/store statements.
> Viraj
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.