Posted to dev@nutch.apache.org by Jack Tang <hi...@gmail.com> on 2005/12/05 17:45:48 UTC

NDFS Connection reset

Hi

I checked out the latest source code from svn and set up NDFS according
to the tutorial (http://wiki.apache.org/nutch/NutchDistributedFileSystem).
I then tested my NDFS using TestClient. Oddly, every time I entered a
command, the NameNode threw an exception:

051206 003714 Server connection on port 9000 from 127.0.0.1: starting
051206 003715 Server connection on port 9000 from 127.0.0.1 caught:
java.net.SocketException: Connection reset
java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(Unknown Source)
	at java.io.BufferedInputStream.fill(Unknown Source)
	at java.io.BufferedInputStream.read(Unknown Source)
	at java.io.DataInputStream.readInt(Unknown Source)
	at org.apache.nutch.ipc.Server$Connection.run(Server.java:124)
051206 003715 Server connection on port 9000 from 127.0.0.1: exiting

Is this a bug in Nutch, or did I miss something in the tutorial?
Thanks

/Jack
--
Keep Discovering ... ...
http://www.jroller.com/page/jmars

Re: NDFS Connection reset

Posted by Paul Baclace <pe...@baclace.net>.
I have recently seen the connection reset problem, and no firewall was involved.

I have been doing a mapred index build over more than 5TB of arc files and I noticed:
   SocketException: Connection reset
that occurred in 1 of 1070 map tasks during the parse phase; the task was automatically restarted and succeeded on the second attempt.
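
The restart-and-succeed behaviour described above can be illustrated with a minimal retry sketch. This is not the Nutch mapred scheduler; the class `RetryOnReset`, the interface `IoTask`, and the method `runWithRetry` are hypothetical names used only to show the pattern of re-running a task after a transient SocketException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.SocketException;

public class RetryOnReset {
    /** Hypothetical task type: a unit of work that may fail with an I/O error. */
    interface IoTask<T> {
        T run() throws IOException;
    }

    /**
     * Runs the task, retrying only when it fails with a SocketException
     * (for example "Connection reset"). Other I/O errors are not retried.
     */
    static <T> T runWithRetry(IoTask<T> task, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SocketException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.run();
            } catch (SocketException e) {
                last = e; // transient network failure: try again
            } catch (IOException e) {
                throw new UncheckedIOException(e); // not a socket error: give up
            }
        }
        throw new UncheckedIOException(last); // every attempt was reset
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = runWithRetry(() -> {
            if (calls[0]++ == 0) {
                throw new SocketException("Connection reset"); // first attempt fails
            }
            return "ok";
        }, 2);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```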

The problem is chaotic/spurious/intermittent and is probably related to
OS network tuning.  It would be nice to know more about the transient
conditions that are associated with this problem.

I checked all the nodes I'm using and the slave nodes all have
high numbers of dropped RX packets.  Example:

   RX packets:1126673543 errors:0 dropped:163568 overruns:0 frame:0
   TX packets:871110771 errors:3 dropped:0 overruns:3 carrier:0

No slave node stands out in particular.  The master node, by contrast,
has dropped only 4 RX packets during 57 days of uptime.
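
To put the counters above in proportion, here is a small sketch that parses an ifconfig-style RX line and computes dropped packets as a fraction of received packets. The class `RxDropRate` and method `droppedFraction` are hypothetical names, and dividing by the "RX packets" counter is an assumption about how to read these numbers. For the example line the ratio works out to about 0.0145%, roughly one packet in 6,900.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RxDropRate {
    // Matches the leading fields of an ifconfig-style RX line.
    private static final Pattern RX =
        Pattern.compile("RX packets:(\\d+)\\s+errors:(\\d+)\\s+dropped:(\\d+)");

    /** Returns dropped / packets for an RX line, or -1 if the line does not match. */
    static double droppedFraction(String line) {
        Matcher m = RX.matcher(line);
        if (!m.find()) {
            return -1.0;
        }
        long packets = Long.parseLong(m.group(1));
        long dropped = Long.parseLong(m.group(3));
        return packets == 0 ? 0.0 : (double) dropped / packets;
    }

    public static void main(String[] args) {
        String line = "   RX packets:1126673543 errors:0 dropped:163568 overruns:0 frame:0";
        System.out.printf("dropped: %.4f%% of RX packets%n", droppedFraction(line) * 100);
    }
}
```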


Paul


Re: NDFS Connection reset

Posted by Jack Tang <hi...@gmail.com>.
Hi Guys

in Server.java

        while (running) {
          int id;
          try {
            id = in.readInt();                    // try to read an id
          } catch (SocketTimeoutException e) {
            e.printStackTrace();
            continue;
          }

What does the id mean? Is it the id of a DataNode? And why would the socket connection be reset?
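
One thing the loop quoted above does show is that only SocketTimeoutException is caught and retried. A reset surfaces as java.net.SocketException, which is a different subclass of IOException, so it propagates out of the loop. The following is a minimal sketch of that behaviour, not the Nutch code: `ResetVsTimeout`, `ResetStream`, and `readLoop` are hypothetical names, and a stub stream stands in for a real socket whose peer reset the connection.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketException;
import java.net.SocketTimeoutException;

public class ResetVsTimeout {
    /** Stub stream standing in for a socket whose peer reset the connection. */
    static class ResetStream extends InputStream {
        @Override
        public int read() throws IOException {
            throw new SocketException("Connection reset");
        }
    }

    /** Mirrors the shape of the quoted loop and reports how it ended. */
    static String readLoop(InputStream raw) {
        DataInputStream in = new DataInputStream(raw);
        try {
            while (true) {
                int id;
                try {
                    id = in.readInt();           // try to read an id
                } catch (SocketTimeoutException e) {
                    continue;                    // only read timeouts are retried
                }
                return "read id " + id;
            }
        } catch (IOException e) {
            // SocketException is not a SocketTimeoutException, so it lands here.
            return "loop exited: " + e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(readLoop(new ResetStream()));
        // prints "loop exited: SocketException: Connection reset"
    }
}
```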
Thanks

/Jack



--
Keep Discovering ... ...
http://www.jroller.com/page/jmars

Re: NDFS Connection reset

Posted by Paul Baclace <pe...@baclace.net>.
Jack Tang wrote:
> It was odd that when I input
> every command, the NameNode would throw exception:
> 
> 051206 003714 Server connection on port 9000 from 127.0.0.1: starting
> 051206 003715 Server connection on port 9000 from 127.0.0.1 caught:
> java.net.SocketException: Connection reset
> java.net.SocketException: Connection reset
> 	at java.net.SocketInputStream.read(Unknown Source)
> 051206 003715 Server connection on port 9000 from 127.0.0.1: exiting

It should work. Do you have a local firewall getting in the way?
You might want to see what Ethereal or tcpdump can reveal about what is happening.

Paul