Posted to common-dev@hadoop.apache.org by "Dima Brodsky (JIRA)" <ji...@apache.org> on 2008/07/21 19:39:31 UTC

[jira] Created: (HADOOP-3802) IPC - Heartbeat exceptions filling up log files

IPC - Heartbeat exceptions filling up log files
-----------------------------------------------

                 Key: HADOOP-3802
                 URL: https://issues.apache.org/jira/browse/HADOOP-3802
             Project: Hadoop Core
          Issue Type: Bug
          Components: ipc
    Affects Versions: 0.17.1
         Environment: Linux 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:59:20 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux -- Fedora Core 4


            Reporter: Dima Brodsky


We have a datanode (10.0.0.93) that is in a semi-live state.  An ssh session can connect to it but cannot send or receive any data; the connection is closed immediately after it is established.  Because of this, the name node's logs are full of the following message:

2008-07-21 09:36:10,800 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call sendHeartbeat(10.0.0.193:50010, 917638877184, 20480, 644100882925, 0, 0) from 10.0.0.193:55908: error: org.apache.hadoop.dfs.IncorrectVersionException: Unexpected version of data node. Reported: -11. Expecting = -13.
org.apache.hadoop.dfs.IncorrectVersionException: Unexpected version of data node. Reported: -11. Expecting = -13.
        at org.apache.hadoop.dfs.NameNode.verifyVersion(NameNode.java:682)
        at org.apache.hadoop.dfs.NameNode.verifyRequest(NameNode.java:669)
        at org.apache.hadoop.dfs.NameNode.sendHeartbeat(NameNode.java:557)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)

This generates approximately 100 MB of log output per minute.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3802) IPC - Heartbeat exceptions filling up log files

Posted by "Dima Brodsky (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615326#action_12615326 ] 

Dima Brodsky commented on HADOOP-3802:
--------------------------------------

The datanode (10.0.0.93) is, I think, still running version 0.16.3 or 0.16.4 and is therefore injecting wrongly versioned heartbeats into the system.  I.e., it has become a rogue node.



[jira] Resolved: (HADOOP-3802) IPC - Heartbeat exceptions filling up log files

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi resolved HADOOP-3802.
----------------------------------

    Resolution: Duplicate



[jira] Commented: (HADOOP-3802) IPC - Heartbeat exceptions filling up log files

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615342#action_12615342 ] 

Konstantin Shvachko commented on HADOOP-3802:
---------------------------------------------

This sounds like a bug.
Data-nodes should shut down if their versions do not match the required one.
I believe we had this behavior in the past.
In DataNode.offerService() the IncorrectVersionException should be treated the same way as
UnregisteredDatanodeException and DisallowedDatanodeException.
Maybe we should introduce a base exception type, such as CriticalDatanodeException, that would require
data-nodes to shut down.
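
The proposal above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Hadoop code: CriticalDatanodeException is the hypothetical base type named in the comment, IncorrectVersionException here is a local stand-in re-parented under it, and HeartbeatTarget, DataNodeSketch, and the bounded retry loop are assumptions made only so that the example runs on its own.

```java
import java.io.IOException;

// Hypothetical base type from the proposal: any subclass means the
// data-node must shut down instead of retrying.
class CriticalDatanodeException extends IOException {
    CriticalDatanodeException(String message) { super(message); }
}

// Local stand-in for org.apache.hadoop.dfs.IncorrectVersionException,
// re-parented under the hypothetical base type for illustration.
class IncorrectVersionException extends CriticalDatanodeException {
    IncorrectVersionException(int reported, int expected) {
        super("Unexpected version of data node. Reported: " + reported
              + ". Expecting = " + expected + ".");
    }
}

// Hypothetical stand-in for the name-node side of the heartbeat RPC.
interface HeartbeatTarget {
    void sendHeartbeat() throws IOException;
}

// Simplified stand-in for the loop in DataNode.offerService().
class DataNodeSketch {
    private boolean shouldRun = true;
    private int attempts = 0;

    // maxAttempts bounds the loop so the sketch always terminates.
    void offerService(HeartbeatTarget nameNode, int maxAttempts) {
        for (int i = 0; shouldRun && i < maxAttempts; i++) {
            attempts++;
            try {
                nameNode.sendHeartbeat();
            } catch (CriticalDatanodeException e) {
                // Fatal condition: stop sending heartbeats altogether.
                shouldRun = false;
            } catch (IOException e) {
                // Any other I/O failure is treated as transient: retry.
            }
        }
    }

    boolean isRunning() { return shouldRun; }
    int attempts() { return attempts; }
}

class HeartbeatSketch {
    public static void main(String[] args) {
        DataNodeSketch dn = new DataNodeSketch();
        // A name-node that always rejects the data-node's version.
        dn.offerService(() -> { throw new IncorrectVersionException(-11, -13); }, 1000);
        // prints: attempts=1 running=false
        System.out.println("attempts=" + dn.attempts() + " running=" + dn.isRunning());
    }
}
```

With a single catch clause on the base type, a version mismatch stops the loop after one rejected heartbeat rather than producing one name-node log entry per retry, and any further fatal condition only needs to extend the base type.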



[jira] Commented: (HADOOP-3802) IPC - Heartbeat exceptions filling up log files

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615387#action_12615387 ] 

Raghu Angadi commented on HADOOP-3802:
--------------------------------------

HADOOP-3758 fixes the DataNode so that it shuts down.



[jira] Commented: (HADOOP-3802) IPC - Heartbeat exceptions filling up log files

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615331#action_12615331 ] 

Lohit Vijayarenu commented on HADOOP-3802:
------------------------------------------

There is a fix for this in HADOOP-3758, and it will be released in 0.17.2.
