Posted to user@pig.apache.org by jiang licht <li...@yahoo.com> on 2010/06/24 02:58:12 UTC

how to check what goes wrong with failed pig jobs?

I have a Pig job that is supposed to save some results to HDFS. Most of the time it runs just fine, but it failed once and no result was generated. There was no dead node, and the namenode, tasktracker and jobtracker were all up and running. The Pig job log follows. It does show some exceptions, but they are not logged as errors, and I found no obvious clues in the namenode, tasktracker and jobtracker logs from the time of the failure.

So, how can I tell what went wrong when something like this happens? Any pointers or thoughts? Thanks,
 
2010-06-22 01:45:06,029 [main] INFO  org.apache.pig.Main - Logging error messages to: /tmp/logs/pig_1277189106028.log
2010-06-22 01:45:06,516 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://hadoopmaster:50001
2010-06-22 01:45:07,211 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: hadoopmaster:50002
2010-06-22 01:45:07,910 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
2010-06-22 01:45:07,911 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 2 map-only splittees.
2010-06-22 01:45:07,911 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 2 out of total 2 splittees.
2010-06-22 01:45:07,911 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2010-06-22 01:45:09,163 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up multi store job
2010-06-22 01:45:09,171 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting identity combiner class.
2010-06-22 01:45:09,270 [Thread-10] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2010-06-22 01:45:09,767 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2010-06-22 01:46:13,385 [Thread-15] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.13:50010
2010-06-22 01:46:13,399 [Thread-15] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-3238247195956896658_451260
2010-06-22 01:46:13,402 [Thread-15] INFO  org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.12:50010
2010-06-22 01:47:19,410 [Thread-15] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.12:50010
2010-06-22 01:47:19,410 [Thread-15] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-8967610649184651127_451260
2010-06-22 01:47:19,412 [Thread-15] INFO  org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.11:50010
2010-06-22 01:48:26,267 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.3:50010
2010-06-22 01:48:26,267 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_9202045569864855438_451264
2010-06-22 01:48:26,283 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.12:50010
2010-06-22 01:49:32,317 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.3:50010
2010-06-22 01:49:32,317 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_62775741158062430_451266
2010-06-22 01:49:32,320 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.5:50010
2010-06-22 01:50:38,326 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.11:50010
2010-06-22 01:50:38,326 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_4600730976353859294_451267
2010-06-22 01:50:38,328 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.10:50010
2010-06-22 01:51:44,338 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.11:50010
2010-06-22 01:51:44,385 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_274719573196033811_451267
2010-06-22 01:51:44,387 [Thread-21] INFO  org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.12:50010
2010-06-22 01:51:50,389 [Thread-21] WARN  org.apache.hadoop.hdfs.DFSClient - DataStreamer Exception: java.io.IOException: Unable to create new block.
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2812)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)

2010-06-22 01:51:50,390 [Thread-21] WARN  org.apache.hadoop.hdfs.DFSClient - Error Recovery for block blk_274719573196033811_451267 bad datanode[1] nodes == null
2010-06-22 01:51:50,390 [Thread-21] WARN  org.apache.hadoop.hdfs.DFSClient - Could not get block locations. Source file "/hadoop/hadoop-hadoop/mapred/system/job_201006211221_0037/job.xml" - Aborting...
ERROR:  'Bad connect ack with firstBadLink 192.168.133.11:50010'
2010-06-22 01:51:50,620 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2010-06-22 01:51:50,621 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
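
Since the exceptions in the log above are reported at INFO or WARN level and the run still ends with "Success!", one way to spot them is to scan the client-side log for exception text instead of relying on the log level. The following is only a rough sketch of that idea in Python, not a Pig or Hadoop feature: the function name summarize_pig_log and the matching rules are assumptions chosen to fit the lines shown in this log.

```python
import re
from collections import Counter

# Matches the "firstBadLink <ip>:<port>" fragment seen in the DFSClient lines.
BAD_LINK = re.compile(r"firstBadLink (\d+\.\d+\.\d+\.\d+:\d+)")


def summarize_pig_log(lines):
    """Hypothetical helper: scan log lines and return (bad_links, flagged).

    bad_links counts how often each datanode address appears as firstBadLink.
    flagged collects lines that mention an exception or an abort, whatever
    log level they were written at.
    """
    bad_links = Counter()
    flagged = []
    for line in lines:
        line = line.rstrip("\n")
        match = BAD_LINK.search(line)
        if match:
            bad_links[match.group(1)] += 1
        if "Exception" in line or "Aborting" in line or line.startswith("ERROR"):
            flagged.append(line)
    return bad_links, flagged
```

It could be used by passing the open log file to the function, for example `summarize_pig_log(open("/tmp/logs/pig_1277189106028.log"))`, and then printing the counter to see which datanode addresses were rejected most often. That only narrows down where to look next; the datanode logs on those hosts would still need to be checked for the underlying cause.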

--Michael