Posted to dev@pig.apache.org by "Xu Zhang (JIRA)" <ji...@apache.org> on 2008/04/04 03:27:24 UTC

[jira] Created: (PIG-186) Pig appears to hang with this Pig script

Pig appears to hang with this Pig script
----------------------------------------

                 Key: PIG-186
                 URL: https://issues.apache.org/jira/browse/PIG-186
             Project: Pig
          Issue Type: Bug
            Reporter: Xu Zhang
            Assignee: Arun C Murthy
            Priority: Critical
         Attachments: DataGuaranteeTest.pl

Pig stopped at 56% progress. There appear to have been exceptions on the reduce task trackers (see below), but waiting for 20 reduce tasks to time out on their own is excruciating and is blocking my other tests.

Here is my Pig script:

{code}
define X `./home/xu/streamingscript/DataGuaranteeTest.pl -n 1` ship('/home/xu/streamingscript/DataGuaranteeTest.pl');
A = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
B = group A by name;
C = foreach B generate flatten(A);
D = stream C through X;
store D into 'results_24';
{code}

Here is the exception on the reduce task trackers:

{noformat}
java.lang.RuntimeException: java.io.IOException: Cannot run program "./home/xu/streamingscript/DataGuaranteeTest.pl": java.io.IOException: error=2, No such file or directory
	at org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.<init>(StreamSpec.java:132)
	at org.apache.pig.impl.eval.StreamSpec.setupDefaultPipe(StreamSpec.java:91)
	at org.apache.pig.impl.eval.CompositeEvalSpec.setupDefaultPipe(CompositeEvalSpec.java:51)
	at org.apache.pig.impl.eval.EvalSpec.setupPipe(EvalSpec.java:123)
	at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.setupReducePipe(PigMapReduce.java:303)
	at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.reduce(PigMapReduce.java:140)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:333)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
{noformat}

I will attach DataGuaranteeTest.pl to the report



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-186) Pig appears to hang with this Pig script

Posted by "Xu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Zhang updated PIG-186:
-------------------------

    Attachment: DataGuaranteeTest.pl




[jira] Updated: (PIG-186) Pig appears to hang with this Pig script

Posted by "Xu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Zhang updated PIG-186:
-------------------------

    Description: 
Pig stopped at 56% progress. There appear to have been exceptions on the reduce task trackers (see below), but waiting for 20 reduce tasks to time out on their own is excruciating and is blocking my other tests.

Here is my Pig script:

{code}
define X `DataGuaranteeTest.pl -n 1` ship('/home/xu/streamingscript/DataGuaranteeTest.pl');
A = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
B = group A by name;
C = foreach B generate flatten(A);
D = stream C through X;
store D into 'results_24';
{code}

Here is the exception on the reduce task trackers:

{noformat}
java.lang.RuntimeException: java.io.IOException: Cannot run program "./home/xu/streamingscript/DataGuaranteeTest.pl": java.io.IOException: error=2, No such file or directory
	at org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.<init>(StreamSpec.java:132)
	at org.apache.pig.impl.eval.StreamSpec.setupDefaultPipe(StreamSpec.java:91)
	at org.apache.pig.impl.eval.CompositeEvalSpec.setupDefaultPipe(CompositeEvalSpec.java:51)
	at org.apache.pig.impl.eval.EvalSpec.setupPipe(EvalSpec.java:123)
	at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.setupReducePipe(PigMapReduce.java:303)
	at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.reduce(PigMapReduce.java:140)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:333)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
{noformat}

I will attach DataGuaranteeTest.pl to the report



  was:
Pig stopped at 56% progress. There appear to have been exceptions on the reduce task trackers (see below), but waiting for 20 reduce tasks to time out on their own is excruciating and is blocking my other tests.

Here is my Pig script:

{code}
define X `./home/xu/streamingscript/DataGuaranteeTest.pl -n 1` ship('/home/xu/streamingscript/DataGuaranteeTest.pl');
A = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
B = group A by name;
C = foreach B generate flatten(A);
D = stream C through X;
store D into 'results_24';
{code}

Here is the exception on the reduce task trackers:

{noformat}
java.lang.RuntimeException: java.io.IOException: Cannot run program "./home/xu/streamingscript/DataGuaranteeTest.pl": java.io.IOException: error=2, No such file or directory
	at org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.<init>(StreamSpec.java:132)
	at org.apache.pig.impl.eval.StreamSpec.setupDefaultPipe(StreamSpec.java:91)
	at org.apache.pig.impl.eval.CompositeEvalSpec.setupDefaultPipe(CompositeEvalSpec.java:51)
	at org.apache.pig.impl.eval.EvalSpec.setupPipe(EvalSpec.java:123)
	at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.setupReducePipe(PigMapReduce.java:303)
	at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.reduce(PigMapReduce.java:140)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:333)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
{noformat}

I will attach DataGuaranteeTest.pl to the report




The second form of the command is exactly what I used. I have just updated the description of the bug. Sorry for the confusion.




[jira] Commented: (PIG-186) Pig appears to hang with this Pig script

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585451#action_12585451 ] 

Arun C Murthy commented on PIG-186:
-----------------------------------

Xu, PIG-182 has a patch which fixes this. Could you try it and let me know whether it solves this problem? Thanks!




[jira] Resolved: (PIG-186) Pig appears to hang with this Pig script

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-186.
--------------------------------

    Resolution: Fixed

Fixed with patch to PIG-182.




[jira] Commented: (PIG-186) Pig appears to hang with this Pig script

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586521#action_12586521 ] 

Arun C Murthy commented on PIG-186:
-----------------------------------

Xu, you are getting a broken pipe because your command specification
{noformat}
define X `./home/xu/streamingscript/DataGuaranteeTest.pl -n 1` ship('/home/xu/streamingscript/DataGuaranteeTest.pl');
{noformat}
is incorrect as per the new *ship* specifications (outlined in the wiki).

Your command should be:
{noformat}
define X `DataGuaranteeTest.pl -n 1` ship('/home/xu/streamingscript/DataGuaranteeTest.pl');
{noformat}

Could you please check against that? Thanks!
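
To illustrate why the first form fails with error=2 while the second one is found, here is a minimal Python sketch (not Pig code). It assumes that a shipped file lands in the task's working directory under its base name only, and that the command is looked up relative to that directory. The helpers `ship`, `resolve` and `demo` are hypothetical names used only for this illustration.

```python
import errno
import os
import shutil
import tempfile


def ship(local_path, task_dir):
    """Copy local_path into task_dir under its base name only (assumed behavior)."""
    dest = os.path.join(task_dir, os.path.basename(local_path))
    shutil.copy(local_path, dest)
    return dest


def resolve(command, task_dir):
    """Look up the program named in command relative to task_dir, or raise ENOENT."""
    program = command.split()[0]
    candidate = os.path.normpath(os.path.join(task_dir, program))
    if not os.path.isfile(candidate):
        raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), program)
    return candidate


def demo():
    """Return (outcome for the original command, outcome for the corrected one)."""
    with tempfile.TemporaryDirectory() as root:
        # Stand-in for the client-side /home/xu/streamingscript directory.
        src_dir = os.path.join(root, "client", "home", "xu", "streamingscript")
        # Stand-in for the working directory of one reduce task.
        task_dir = os.path.join(root, "task")
        os.makedirs(src_dir)
        os.makedirs(task_dir)
        script = os.path.join(src_dir, "DataGuaranteeTest.pl")
        with open(script, "w") as f:
            f.write("#!/usr/bin/perl\n")
        ship(script, task_dir)

        outcomes = []
        for command in ("./home/xu/streamingscript/DataGuaranteeTest.pl -n 1",
                        "DataGuaranteeTest.pl -n 1"):
            try:
                resolve(command, task_dir)
                outcomes.append("found")
            except FileNotFoundError as exc:
                outcomes.append("error=%d" % exc.errno)
        return tuple(outcomes)


if __name__ == "__main__":
    print(demo())
```

Under these assumptions the directory part of the original command does not exist inside the task directory, so only the bare file name can be resolved there.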





[jira] Commented: (PIG-186) Pig appears to hang with this Pig script

Posted by "Xu Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586490#action_12586490 ] 

Xu Zhang commented on PIG-186:
------------------------------

The "No such file or directory" exception is gone, but I got a "broken pipe" exception instead. See the tasks UI for job_200804041056_0346.
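
For context, "broken pipe" is the error a writer sees when the process at the reading end of a pipe has already gone away. The following is a minimal Python sketch of that general mechanism, assuming a POSIX system; it is not Pig code, and the child process is only a hypothetical stand-in for a streaming command that exits before consuming its input.

```python
import subprocess
import sys


def write_to_exited_child():
    """Write to the stdin of a child that has already exited; return the error name."""
    # The child exits immediately without reading its standard input.
    # bufsize=0 makes the write go straight to the pipe, with no buffering.
    child = subprocess.Popen(
        [sys.executable, "-c", "pass"],
        stdin=subprocess.PIPE,
        bufsize=0,
    )
    child.wait()
    try:
        # Nobody is left to read this record, so the write fails with EPIPE.
        child.stdin.write(b"name\tage\tgpa\n")
    except BrokenPipeError as exc:
        return type(exc).__name__
    finally:
        child.stdin.close()
    return "no error"


if __name__ == "__main__":
    print(write_to_exited_child())
```

Whether the streaming command in this job exited early for the same reason is not established here; the tasks UI for the job is the place to confirm it.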

