Posted to user@pig.apache.org by Tommy Chheng <to...@gmail.com> on 2009/09/25 07:14:53 UTC

hadoop reduce job stalling

I'm trying to run my Pig script on an EC2 large instance using the
Cloudera 0.18 distribution. The script itself works in local mode on a
reduced data set.

The map phase finished quickly, but the script is stalling in the reduce
stage at 28%. It doesn't report a failure; it just isn't going past
26.8%. It has been running for 8 hours.

Any ideas for a workaround for this exception?
The task log file:

2009-09-25 04:12:06,123 WARN org.apache.hadoop.dfs.DFSClient: NotReplicatedYetException sleeping /tmp/temp-1270628176/tmp-407746021/_temporary/_attempt_200909241708_0002_r_000000_0/part-00000 retries left 2
2009-09-25 04:12:07,744 INFO org.apache.hadoop.dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.NotReplicatedYetException: Not replicated yet:/tmp/temp-1270628176/tmp-407746021/_temporary/_attempt_200909241708_0002_r_000000_0/part-00000
         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1109)
         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)