Posted to common-user@hadoop.apache.org by jiang licht <li...@yahoo.com> on 2010/02/25 21:17:43 UTC
Hadoop freeze?
I ran into the following problem while running a Hadoop job written in Pig. Please help me check what caused the issue. As far as I can tell, the job/task tracker failed for some reason, but the name/data nodes are still functioning.
The job seems to make no progress at all (no output, no log), although a couple of other Hadoop jobs ran successfully before this one. "hadoop fs -ls" can still list files. But when I ran "hadoop job -list", it took a long time and then failed with the following error message:
Exception in thread "main" java.io.IOException: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.hadoop.ipc.Client.call(Client.java:699)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:435)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:429)
    at org.apache.hadoop.mapred.JobClient.run(JobClient.java:1512)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1727)
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    at java.io.FilterInputStream.read(FilterInputStream.java:116)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:271)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at java.io.DataInputStream.readInt(DataInputStream.java:370)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:493)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)
The web interface to the job tracker on port 50030 did not respond at all.
Checking with netstat, port 50030 sometimes shows up and sometimes does not. Connections and ports for the data nodes were shown there.
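For anyone who wants to repeat the port check without netstat, here is a minimal sketch in Python. It assumes the job tracker RPC port is 50002 and the web UI port is 50030, as in the messages in this post; "jobtracker-host" is a placeholder hostname and port_is_open is a helper name made up for this example.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    host = "jobtracker-host"  # placeholder: replace with the real job tracker host
    # Port numbers are taken from the error messages and web UI mentioned in this post.
    for name, port in (("job tracker RPC", 50002), ("job tracker web UI", 50030)):
        state = "open" if port_is_open(host, port) else "not reachable"
        print(f"{name} port {port}: {state}")
```

Note that an open port only shows that something accepted the TCP connection. In the traces in this post the connection was reset after it was established, so a successful check here does not by itself mean the job tracker is healthy.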
Then, when I ran another Pig job, it failed with the following error:
Error before Pig is launched
----------------------------
ERROR 6009: Failed to create job client:Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
org.apache.pig.backend.executionengine.ExecException: ERROR 6009: Failed to create job client:Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:217)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:137)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:199)
    at org.apache.pig.PigServer.<init>(PigServer.java:169)
    at org.apache.pig.PigServer.<init>(PigServer.java:158)
    at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:54)
    at org.apache.pig.Main.main(Main.java:395)
Caused by: java.io.IOException: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.hadoop.ipc.Client.call(Client.java:699)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:435)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:429)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:398)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:212)
    ... 6 more
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    at java.io.FilterInputStream.read(FilterInputStream.java:116)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:271)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at java.io.DataInputStream.readInt(DataInputStream.java:370)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:493)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)
================================================================================
Thanks,
Michael