Posted to dev@nutch.apache.org by Jack Tang <hi...@gmail.com> on 2005/12/05 17:45:48 UTC
NDFS Connection reset
Hi
I checked out the latest source code from svn and tried NDFS following
the tutorial (http://wiki.apache.org/nutch/NutchDistributedFileSystem).
I then tested my NDFS setup using TestClient. Oddly, whenever I enter
any command, the NameNode throws an exception:
051206 003714 Server connection on port 9000 from 127.0.0.1: starting
051206 003715 Server connection on port 9000 from 127.0.0.1 caught:
java.net.SocketException: Connection reset
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.nutch.ipc.Server$Connection.run(Server.java:124)
051206 003715 Server connection on port 9000 from 127.0.0.1: exiting
Is this a bug in Nutch, or did I miss something in the tutorial?
Thanks
/Jack
--
Keep Discovering ... ...
http://www.jroller.com/page/jmars
Re: NDFS Connection reset
Posted by Paul Baclace <pe...@baclace.net>.
I have recently seen the connection reset problem, and no firewall was involved.
I have been doing a mapred index build over more than 5TB of arc files and I noticed:
SocketException: Connection reset
that occurred in 1 of 1070 map tasks during the parse phase; the task was automatically restarted and succeeded on the second attempt.
The problem is chaotic/spurious/intermittent and is probably related to
OS network tuning. It would be nice to know more about the transient
conditions that are associated with this problem.
I checked all the nodes I'm using and the slave nodes all have
high numbers of dropped RX packets. Example:
RX packets:1126673543 errors:0 dropped:163568 overruns:0 frame:0
TX packets:871110771 errors:3 dropped:0 overruns:3 carrier:0
No slave node stands out in particular. The master node, by contrast,
has dropped only 4 RX packets during 57 days of uptime.
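To put the counters above in proportion, here is a minimal sketch that pulls the packet and dropped counts out of an ifconfig-style RX line and computes their ratio. The class and method names (RxDropRate, dropRate) are hypothetical, and treating dropped/packets as a "drop rate" is an assumption, since whether dropped packets are included in the packet total depends on the driver.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RxDropRate {
    // Matches the "RX packets:N errors:N dropped:N ..." line format shown above.
    private static final Pattern RX_LINE =
        Pattern.compile("RX packets:(\\d+)\\s+errors:(\\d+)\\s+dropped:(\\d+)");

    /** Returns dropped / packets for an RX line, or -1 if the line does not match. */
    public static double dropRate(String line) {
        Matcher m = RX_LINE.matcher(line);
        if (!m.find()) {
            return -1.0;
        }
        long packets = Long.parseLong(m.group(1));
        long dropped = Long.parseLong(m.group(3));
        // Assumption: the ratio of the two counters is a fair proxy for the drop rate.
        return packets == 0 ? 0.0 : (double) dropped / packets;
    }

    public static void main(String[] args) {
        String slave = "RX packets:1126673543 errors:0 dropped:163568 overruns:0 frame:0";
        System.out.printf("slave drop rate: %.4f%%%n", 100.0 * dropRate(slave));
    }
}
```

For the slave line quoted above this works out to roughly one dropped packet in every 6,900 received, which is small in relative terms even though the absolute count looks large.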
Paul
Re: NDFS Connection reset
Posted by Jack Tang <hi...@gmail.com>.
Hi Guys
In Server.java:

    while (running) {
      int id;
      try {
        id = in.readInt();   // try to read an id
      } catch (SocketTimeoutException e) {
        e.printStackTrace();
        continue;
      }
What does the id mean? Is it the id of the DataNode? And why would the socket connection be reset?
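The excerpt above catches only SocketTimeoutException, so any other failure of readInt() leaves the loop, which is consistent with the "caught: ... exiting" lines in the log. As a rough illustration (not the Nutch code), the sketch below runs a similar read loop on loopback against a client that either closes cleanly or aborts its connection. The class name ReadLoopDemo, the run method, and the use of SO_LINGER 0 to provoke a reset are all assumptions made for the demo. Whether an aborted connection surfaces as "Connection reset" before or after the pending int is read can vary by platform.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.net.SocketTimeoutException;

public class ReadLoopDemo {
    /**
     * Accepts one loopback connection, reads ints until the peer goes away,
     * and reports how the loop ended. If abort is true the client closes with
     * SO_LINGER 0, which typically makes close() send a TCP RST.
     */
    public static String run(boolean abort) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread client = new Thread(() -> {
                try {
                    Socket s = new Socket(server.getInetAddress(), server.getLocalPort());
                    new DataOutputStream(s.getOutputStream()).writeInt(42);
                    if (abort) {
                        s.setSoLinger(true, 0); // abortive close instead of a normal FIN
                    }
                    s.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            client.start();
            try (Socket conn = server.accept()) {
                conn.setSoTimeout(2000);
                DataInputStream in = new DataInputStream(conn.getInputStream());
                int count = 0;
                while (true) {
                    try {
                        in.readInt();
                        count++;
                    } catch (SocketTimeoutException e) {
                        continue; // idle peer: keep waiting, as in the excerpt
                    } catch (EOFException e) {
                        return "eof after " + count;   // peer closed cleanly
                    } catch (SocketException e) {
                        return "reset after " + count; // peer aborted the connection
                    }
                }
            } finally {
                client.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("clean close: " + run(false));
        System.out.println("abortive close: " + run(true));
    }
}
```

A client that exits or closes its socket without a normal shutdown would show up on the server side in the second way, so a packet capture of the TestClient session should reveal which side sends the RST.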
Thanks
/Jack
--
Keep Discovering ... ...
http://www.jroller.com/page/jmars
Re: NDFS Connection reset
Posted by Paul Baclace <pe...@baclace.net>.
Jack Tang wrote:
> It was odd that when I input
> every command, the NameNode would throw exception:
>
> 051206 003714 Server connection on port 9000 from 127.0.0.1: starting
> 051206 003715 Server connection on port 9000 from 127.0.0.1 caught:
> java.net.SocketException: Connection reset
> java.net.SocketException: Connection reset
> at java.net.SocketInputStream.read(Unknown Source)
> 051206 003715 Server connection on port 9000 from 127.0.0.1: exiting
It should work. Do you have a local firewall getting in the way?
You might want to see what Ethereal or tcpdump can reveal about what is happening.
Paul