Posted to user@nutch.apache.org by Michael <mi...@gameservice.ru> on 2005/09/27 03:29:12 UTC

Re[2]: java.io.IOException: Task process exit with nonzero status

DC> What version of the mapred branch are you running?  I fixed a bug a week
DC> and a half ago that could cause this.  There was a filehandle leak that
DC> resulted in this error after a tasktracker had run more than around 800
DC> tasks.  If you have not updated your code recently, please try that.

It seems that the new version fixed this problem; I haven't seen this
error anymore. But a new problem arose during the indexing process (I'm
using mapred revision 291801):

I'm trying to index via "./nutch index"; the segments were created by a
slightly modified version of the crawl.Crawl class. With 1-2 segments
everything works OK; with about 20 segments the task tracker logs on both
servers show a repeating error block:

050926 180831 task_r_o4tt4z Got 1 map output locations.
050926 180831 Client connection to 127.0.0.1:60218: starting
050926 180831 Server connection on port 60218 from 127.0.0.1: starting
050926 180831 Client connection to 127.0.0.1:60218 caught: java.lang.IndexOutOfBoundsException
java.lang.IndexOutOfBoundsException
        at java.io.DataInputStream.readFully(DataInputStream.java:263)
        at org.apache.nutch.mapred.MapOutputFile.readFields(MapOutputFile.java:123)
        at org.apache.nutch.io.ObjectWritable.readObject(ObjectWritable.java:232)
        at org.apache.nutch.io.ObjectWritable.readFields(ObjectWritable.java:60)
        at org.apache.nutch.ipc.Client$Connection.run(Client.java:163)
050926 180831 Client connection to 127.0.0.1:60218: closing
050926 180831 Server handler on 60218 caught: java.net.SocketException: Connection reset
java.net.SocketException: Connection reset
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:106)
        at java.io.DataOutputStream.write(DataOutputStream.java:85)
        at org.apache.nutch.mapred.MapOutputFile.write(MapOutputFile.java:98)
        at org.apache.nutch.io.ObjectWritable.writeObject(ObjectWritable.java:117)
        at org.apache.nutch.io.ObjectWritable.write(ObjectWritable.java:64)
        at org.apache.nutch.ipc.Server$Handler.run(Server.java:213)
050926 180831 Server connection on port 60218 from 127.0.0.1: exiting
050926 180931 task_r_o4tt4z copy failed: task_m_ypindn from goku1.deeptown.net/127.0.0.1:60218
java.io.IOException: timed out waiting for response
        at org.apache.nutch.ipc.Client.call(Client.java:296)
        at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
        at $Proxy2.getFile(Unknown Source)
        at org.apache.nutch.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:94)
        at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:61)

Michael
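
Note on the trace above: the top frame, DataInputStream.readFully, throws
IndexOutOfBoundsException when it is handed a negative length, which is
consistent with the exception in the log. The following is a minimal
standalone sketch of that behaviour, not the Nutch code; the class name
NegativeLengthRead and the 16-byte buffer are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class NegativeLengthRead {
    // Returns true if readFully rejects the requested length with
    // IndexOutOfBoundsException, false if the read goes through.
    static boolean rejects(int len) {
        byte[] buffer = new byte[16];
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(new byte[16]));
        try {
            in.readFully(buffer, 0, len);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        } catch (IOException e) {
            // Not expected with an in-memory stream of sufficient size.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("len = 8  rejected: " + rejects(8));
        System.out.println("len = -1 rejected: " + rejects(-1));
    }
}
```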


Re: java.io.IOException: Task process exit with nonzero status

Posted by Doug Cutting <cu...@nutch.org>.
Michael wrote:
> I'm not sure how Java handles automatic type casting (I'm from the C
> world), and I found that Java doesn't support unsigned int, so I did
> it like this:
> 
>      int bytesToRead=buffer.length;
>      if(((int)unread)>0)
>      {
>             bytesToRead = Math.min((int) unread, buffer.length);
>      }

I think the patch I proposed is simpler and correct.  I committed it.

> Yes, this helped me, though I don't understand why others haven't
> experienced this problem.

I think the reason that I have not seen it is that I usually run
hundreds of map tasks, and the output of a single map task has never
been greater than 2GB.

Thanks for catching this!

Doug
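
For context on the 2GB figure: a Java int tops out at 2^31 - 1, so a long
byte count at or above 2^31 (2 GiB) no longer survives an (int) cast. A
small sketch of the narrowing, using hypothetical names (LongToIntCast,
narrow) that do not come from the Nutch source:

```java
class LongToIntCast {
    // Plain narrowing conversion: keeps only the low 32 bits of the long.
    static int narrow(long unread) {
        return (int) unread;
    }

    public static void main(String[] args) {
        long[] samples = {
            Integer.MAX_VALUE,   // just under 2 GiB: still fits
            1L << 31,            // exactly 2 GiB: wraps to a negative int
            3L << 30,            // 3 GiB: also negative
            (1L << 32) + 5       // just over 4 GiB: wraps to a small positive int
        };
        for (long unread : samples) {
            System.out.println(unread + " -> " + narrow(unread));
        }
    }
}
```

With many map tasks each producing well under 2 GiB, the count never
reaches the range where the cast changes its value.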

Re[2]: java.io.IOException: Task process exit with nonzero status

Posted by Michael <mi...@gameservice.ru>.
I'm not sure how Java handles automatic type casting (I'm from the C
world), and I found that Java doesn't support unsigned int, so I did
it like this:

     int bytesToRead=buffer.length;
     if(((int)unread)>0)
     {
            bytesToRead = Math.min((int) unread, buffer.length);
     }
     
Yes, this helped me, though I don't understand why others haven't
experienced this problem.

>> I think I found the problem:
>> At MapOutputFile.java:123
>> bytesToRead = Math.min((int) unread, buffer.length);
>> 
>> If unread is 2^31 or greater, the (int) cast can make bytesToRead negative.

DC> So the fix is to change this to:

DC> bytesToRead = (int)Math.min(unread, buffer.length);

DC> Right?  Does this fix things for you?  If so, I'll commit it.

DC> Thanks,

DC> Doug

Michael


Re: java.io.IOException: Task process exit with nonzero status

Posted by Doug Cutting <cu...@nutch.org>.
Michael wrote:
> I think I found the problem:
> At MapOutputFile.java:123
> bytesToRead = Math.min((int) unread, buffer.length);
> 
> If unread is 2^31 or greater, the (int) cast can make bytesToRead negative.

So the fix is to change this to:

bytesToRead = (int)Math.min(unread, buffer.length);

Right?  Does this fix things for you?  If so, I'll commit it.

Thanks,

Doug
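
To compare the two expressions side by side, here is a rough sketch that
assumes unread is a long and buffer.length an int, as the thread implies.
The class ChunkSize, its method names, and the 8 KB buffer size are
invented for illustration; no real I/O is done, the loop only tracks the
byte count.

```java
class ChunkSize {
    // Expression as quoted from MapOutputFile.java:123: the cast happens
    // first, so a large unread value is truncated before the comparison.
    static int castThenMin(long unread, int bufferLength) {
        return Math.min((int) unread, bufferLength);
    }

    // Proposed fix: compare as longs, then cast. The result never exceeds
    // bufferLength, so the cast cannot change it.
    static int minThenCast(long unread, int bufferLength) {
        return (int) Math.min(unread, bufferLength);
    }

    // Walks a pretend transfer of `total` bytes in buffer-sized steps
    // using the fixed expression and returns how many steps it took.
    static long countChunks(long total, int bufferLength) {
        long unread = total;
        long chunks = 0;
        while (unread > 0) {
            int bytesToRead = minThenCast(unread, bufferLength);
            unread -= bytesToRead;
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        long threeGiB = 3L << 30;
        int bufferLength = 8192;
        System.out.println("cast then min: " + castThenMin(threeGiB, bufferLength));
        System.out.println("min then cast: " + minThenCast(threeGiB, bufferLength));
        System.out.println("chunks for 3 GiB: " + countChunks(threeGiB, bufferLength));
    }
}
```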

Re[3]: java.io.IOException: Task process exit with nonzero status

Posted by Michael <mi...@gameservice.ru>.
I think I found the problem:
At MapOutputFile.java:123
bytesToRead = Math.min((int) unread, buffer.length);

If unread is 2^31 or greater, the (int) cast can make bytesToRead negative.

Michael