Posted to user@pig.apache.org by jiang licht <li...@yahoo.com> on 2010/06/24 02:58:12 UTC
how to check what goes wrong with failed pig jobs?
I have a Pig job that is supposed to save some results to HDFS. Most of the time it runs fine, but it failed once and no output was generated. No datanode was dead, and the namenode, jobtracker, and tasktrackers were all up and running. The Pig job log is below. It does show some exceptions, but they are logged at INFO/WARN rather than as errors, and I can't find any obvious clues in the namenode, tasktracker, or jobtracker logs around the time of the failure.
So, how can I tell what went wrong when something like this happens? Any pointers or thoughts? Thanks,
2010-06-22 01:45:06,029 [main] INFO org.apache.pig.Main - Logging error messages to: /tmp/logs/pig_1277189106028.log
2010-06-22 01:45:06,516 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://hadoopmaster:50001
2010-06-22 01:45:07,211 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: hadoopmaster:50002
2010-06-22 01:45:07,910 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
2010-06-22 01:45:07,911 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 2 map-only splittees.
2010-06-22 01:45:07,911 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 2 out of total 2 splittees.
2010-06-22 01:45:07,911 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2010-06-22 01:45:09,163 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up multi store job
2010-06-22 01:45:09,171 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting identity combiner class.
2010-06-22 01:45:09,270 [Thread-10] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2010-06-22 01:45:09,767 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2010-06-22 01:46:13,385 [Thread-15] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.13:50010
2010-06-22 01:46:13,399 [Thread-15] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-3238247195956896658_451260
2010-06-22 01:46:13,402 [Thread-15] INFO org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.12:50010
2010-06-22 01:47:19,410 [Thread-15] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.12:50010
2010-06-22 01:47:19,410 [Thread-15] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-8967610649184651127_451260
2010-06-22 01:47:19,412 [Thread-15] INFO org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.11:50010
2010-06-22 01:48:26,267 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.3:50010
2010-06-22 01:48:26,267 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_9202045569864855438_451264
2010-06-22 01:48:26,283 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.12:50010
2010-06-22 01:49:32,317 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.3:50010
2010-06-22 01:49:32,317 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_62775741158062430_451266
2010-06-22 01:49:32,320 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.5:50010
2010-06-22 01:50:38,326 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.11:50010
2010-06-22 01:50:38,326 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_4600730976353859294_451267
2010-06-22 01:50:38,328 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.10:50010
2010-06-22 01:51:44,338 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.133.11:50010
2010-06-22 01:51:44,385 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_274719573196033811_451267
2010-06-22 01:51:44,387 [Thread-21] INFO org.apache.hadoop.hdfs.DFSClient - Waiting to find target node: 192.168.133.12:50010
2010-06-22 01:51:50,389 [Thread-21] WARN org.apache.hadoop.hdfs.DFSClient - DataStreamer Exception: java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2812)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
2010-06-22 01:51:50,390 [Thread-21] WARN org.apache.hadoop.hdfs.DFSClient - Error Recovery for block blk_274719573196033811_451267 bad datanode[1] nodes == null
2010-06-22 01:51:50,390 [Thread-21] WARN org.apache.hadoop.hdfs.DFSClient - Could not get block locations. Source file "/hadoop/hadoop-hadoop/mapred/system/job_201006211221_0037/job.xml" - Aborting...
ERROR: 'Bad connect ack with firstBadLink 192.168.133.11:50010'
2010-06-22 01:51:50,620 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2010-06-22 01:51:50,621 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
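For reference, a quick way to tally which datanodes show up as firstBadLink in a saved copy of the client log, so their datanode logs can be checked around those timestamps (a sketch only; the file name pig_1277189106028.log is taken from the first log line above and the path may differ on your setup):

```shell
# Pull every "firstBadLink <ip>:<port>" occurrence out of the Pig client
# log, then count how often each datanode address appears, most frequent
# first. Adjust the log path to wherever your client log was written.
grep -o 'firstBadLink [0-9.]*:[0-9]*' pig_1277189106028.log \
  | awk '{print $2}' \
  | sort | uniq -c | sort -rn
```

The addresses that appear most often are the first datanodes worth inspecting (their logs around these times, plus disk, xceiver, and firewall settings on port 50010).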
--Michael