Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2006/06/07 01:08:29 UTC

[jira] Created: (HADOOP-282) the datanode crashes if it starts before the namenode

the datanode crashes if it starts before the namenode
-----------------------------------------------------

         Key: HADOOP-282
         URL: http://issues.apache.org/jira/browse/HADOOP-282
     Project: Hadoop
        Type: Bug

  Components: dfs  
    Versions: 0.3.1    
    Reporter: Owen O'Malley
 Assigned to: Owen O'Malley 
    Priority: Critical
     Fix For: 0.3.2
 Attachments: data-node-first.patch

If the datanode tries to register before the namenode is offering service, it crashes with an uncaught exception.

java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:507)
        at java.net.Socket.connect(Socket.java:457)
        at java.net.Socket.<init>(Socket.java:365)
        at java.net.Socket.<init>(Socket.java:207)
        at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:112)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:351)
        at org.apache.hadoop.ipc.Client.call(Client.java:289)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:150)
        at org.apache.hadoop.dfs.$Proxy0.register(Unknown Source)
        at org.apache.hadoop.dfs.DataNode.register(DataNode.java:176)
        at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:109)
        at org.apache.hadoop.dfs.DataNode.makeInstanceForDir(DataNode.java:892)
        at org.apache.hadoop.dfs.DataNode.run(DataNode.java:846)
        at org.apache.hadoop.dfs.DataNode.runAndWait(DataNode.java:862)
        at org.apache.hadoop.dfs.DataNode.main(DataNode.java:917)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Updated: (HADOOP-282) the datanode crashes if it starts before the namenode

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-282?page=all ]

Owen O'Malley updated HADOOP-282:
---------------------------------

    Attachment: data-node-first-2.patch

Ok, this patch fixes the problem without the bad side effects that the old patch had. It passes the regression tests, and I'm currently installing it on my 200-node cluster. I'll post a message once I can tell how it goes.




[jira] Resolved: (HADOOP-282) the datanode crashes if it starts before the namenode

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-282?page=all ]
     
Doug Cutting resolved HADOOP-282:
---------------------------------

    Resolution: Fixed

I just committed this.  Thanks!




[jira] Updated: (HADOOP-282) the datanode crashes if it starts before the namenode

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-282?page=all ]

Owen O'Malley updated HADOOP-282:
---------------------------------

    Attachment: data-node-first.patch

This patch moves the datanode registration into the offerService method, so that if an exception is thrown, registration is retried after 5 seconds.
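
For readers without the attachment, here is a rough sketch of the idea, not the patch itself. The class RetryingRegistration, the Registrar interface, and the accessor names are all hypothetical stand-ins and do not come from the Hadoop source; only the method name offerService and the retry-after-a-delay behaviour are taken from the description above. The demo uses a 10 ms delay so it finishes quickly; the patch description mentions 5 seconds.

```java
import java.io.IOException;
import java.net.ConnectException;

/**
 * Hypothetical sketch: register inside the service loop and retry on failure,
 * instead of registering in the constructor where a failure is fatal.
 * None of these names are taken from the Hadoop code or the attached patch.
 */
public class RetryingRegistration {

    /** Stand-in for the RPC call that registers a datanode with the namenode. */
    public interface Registrar {
        void register() throws IOException;
    }

    private final Registrar registrar;
    private final long retryDelayMillis;
    private boolean registered = false;
    private int attempts = 0;

    public RetryingRegistration(Registrar registrar, long retryDelayMillis) {
        this.registrar = registrar;
        this.retryDelayMillis = retryDelayMillis;
    }

    /** Loops until registration succeeds, sleeping between failed attempts. */
    public void offerService() throws InterruptedException {
        while (!registered) {
            attempts++;
            try {
                registrar.register();
                registered = true;
            } catch (IOException e) {
                // The namenode is not reachable yet: report and retry rather than crash.
                System.err.println("Registration failed (" + e.getMessage()
                        + "), retrying in " + retryDelayMillis + " ms");
                Thread.sleep(retryDelayMillis);
            }
        }
        // The normal service work (heartbeats, block reports) would follow here.
    }

    public int getAttempts() {
        return attempts;
    }

    public boolean isRegistered() {
        return registered;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated namenode that refuses the first two connection attempts.
        int[] calls = {0};
        Registrar flaky = () -> {
            if (++calls[0] < 3) {
                throw new ConnectException("Connection refused");
            }
        };
        RetryingRegistration node = new RetryingRegistration(flaky, 10);
        node.offerService();
        System.out.println("registered after " + node.getAttempts() + " attempts");
    }
}
```

The point of the sketch is only the control flow: because the registration call sits inside the loop, a "Connection refused" from a namenode that has not started yet becomes a delayed retry instead of an uncaught exception during construction.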

