Posted to common-dev@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2008/05/27 18:26:02 UTC

[jira] Created: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Add some more hints of the problem when datanode and namenode don't match
-------------------------------------------------------------------------

                 Key: HADOOP-3448
                 URL: https://issues.apache.org/jira/browse/HADOOP-3448
             Project: Hadoop Core
          Issue Type: Improvement
          Components: documentation
    Affects Versions: 0.18.0
            Reporter: Steve Loughran
            Priority: Minor
         Attachments: hadoop-3448.patch

When there is a mismatch between the name node and data node layout versions, and you are running with -ea set, DataNode.handshake() bails out with the assertion "Data-node and name-node layout versions must be the same."

However, this message doesn't actually say which version numbers are at fault. A better error message would include the version information, so pointing the finger of blame would be easier.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600646#action_12600646 ] 

Konstantin Shvachko commented on HADOOP-3448:
---------------------------------------------

I also noticed that if you don't do a clean build the new constants will not be picked up everywhere.
This probably has something to do with dependencies in our build.xml.

> Add some more hints of the problem when datanode and namenode don't match
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-3448
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3448
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Steve Loughran
>            Priority: Minor
>         Attachments: hadoop-3448.patch
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> When there is a mismatch between the name node and data node layout versions, and you are running with -ea set, DataNode.handshake() bails out with the assertion "Data-node and name-node layout versions must be the same."
> However, this message doesn't actually say which version numbers are at fault. A better error message would include the version information, so pointing the finger of blame would be easier.



[jira] Updated: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-3448:
-----------------------------------

    Status: Patch Available  (was: Open)

Patch available.



[jira] Commented: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600639#action_12600639 ] 

Steve Loughran commented on HADOOP-3448:
----------------------------------------

The problem has 'gone away'; I think forcing clean builds through fixed it. Given I was running the same version everywhere, I was somewhat surprised too.

Two possible causes:
 - classloader/dual version on path (Ivy is in charge of CP setup; it should have caught this)
 - cached versions of constants not being picked up
Although Java is fairly smart with dependencies, compile-time constants (static final primitives and Strings) get copied into the classes that use them at compile time, so they can be out of date if the constant changes and those classes are not recompiled.
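
A minimal sketch of the constant-copying behaviour described above, for anyone who wants to see it in one file. The class names and the value -16 are made up for illustration and are not taken from the Hadoop source. Reading a compile-time constant does not even touch the class that declares it, because the compiler has already copied the value into the caller; that is why a stale caller keeps its old value until it is recompiled.

```java
// Sketch only: class names and the value -16 are hypothetical, not Hadoop code.
class ConstantInliningDemo {
    static boolean constantHolderInitialized = false;
    static boolean computedHolderInitialized = false;

    static class ConstantHolder {
        // A compile-time constant: javac copies the value into every class that reads it.
        static final int LAYOUT_VERSION = -16;
        static { constantHolderInitialized = true; }
    }

    static class ComputedHolder {
        // Not a compile-time constant: the value is fetched from this class at run time.
        static final int LAYOUT_VERSION = Integer.parseInt("-16");
        static { computedHolderInitialized = true; }
    }

    static int readConstant() { return ConstantHolder.LAYOUT_VERSION; }

    static int readComputed() { return ComputedHolder.LAYOUT_VERSION; }

    public static void main(String[] args) {
        // The inlined read never initializes ConstantHolder.
        System.out.println(readConstant() + " holder initialized: " + constantHolderInitialized);
        // The computed read has to load and initialize ComputedHolder.
        System.out.println(readComputed() + " holder initialized: " + computedHolderInitialized);
    }
}
```

Declaring a constant through a method call, as ComputedHolder does, is one way to stop the value being copied into callers; whether that is worth doing here is a separate question from this issue.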

Anyway, the build is fixed. Is the patch still needed? Well, maybe the text could be clearer, or just point to a wiki page discussing this issue.



[jira] Updated: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-3448:
-----------------------------------

    Attachment: hadoop-3448.patch

Patch to make the assertion more detailed:
      "Data-node and name-node layout versions must be the same."
      + " Expected: " + FSConstants.LAYOUT_VERSION + " actual " + nsInfo.getLayoutVersion();




[jira] Commented: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600704#action_12600704 ] 

Steve Loughran commented on HADOOP-3448:
----------------------------------------

It's probably because the .class files don't even track where constants came from. Change a constant: do a clean build.

Perhaps we should change the assert to say this. 



[jira] Updated: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-3448:
----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.18.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Steve!



[jira] Updated: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3448:
----------------------------------------

    Component/s:     (was: documentation)
                 dfs



[jira] Assigned: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler reassigned HADOOP-3448:
---------------------------------------

    Assignee: Steve Loughran

Assigning to whoever submitted a patch so as to better manage committing things that are ready for prime time.



[jira] Commented: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600562#action_12600562 ] 

Konstantin Shvachko commented on HADOOP-3448:
---------------------------------------------

This should actually never happen. Something is wrong with your build. Here is why.
DataNode.handshake() receives version information from the name-node and verifies it against its own versions.
The first thing that is verified is the build version. BV is reset every time you rebuild Hadoop, while LAYOUT_VERSION changes only if it is actually changed in the code.
So, if the LAYOUT_VERSIONs are different, then the build versions must be different too.
The assert you mentioned is really an assert and should not be used for diagnostic purposes.
Maybe the problem is somewhere else. Could you please post here the (different) layout versions and (presumably equal) build versions?
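
The ordering described above can be sketched roughly as follows. This is not the DataNode source: VersionInfo, the method signature, the exception wording and the sample values are all assumptions made for illustration. The point is only that the build-version comparison comes first and is always enforced, so the layout assert is reached only when the build versions already match.

```java
import java.io.IOException;

// Sketch only: VersionInfo and the sample values are hypothetical, not Hadoop classes.
class HandshakeOrderSketch {
    static final class VersionInfo {
        final String buildVersion;
        final int layoutVersion;

        VersionInfo(String buildVersion, int layoutVersion) {
            this.buildVersion = buildVersion;
            this.layoutVersion = layoutVersion;
        }
    }

    static void handshake(VersionInfo local, VersionInfo remote) throws IOException {
        // Step 1: the build version is compared first and is enforced on every run.
        if (!local.buildVersion.equals(remote.buildVersion)) {
            throw new IOException("Incompatible build versions: name-node BV = "
                + remote.buildVersion + "; data-node BV = " + local.buildVersion);
        }
        // Step 2: reached only with equal build versions, so a mismatch here
        // points at an inconsistent build. Evaluated only when -ea is set.
        assert local.layoutVersion == remote.layoutVersion
            : "Data-node and name-node layout versions must be the same."
              + " Expected: " + local.layoutVersion + " actual " + remote.layoutVersion;
    }

    public static void main(String[] args) {
        VersionInfo dataNode = new VersionInfo("r1234", -16);
        VersionInfo nameNode = new VersionInfo("r1235", -17);
        try {
            handshake(dataNode, nameNode);
        } catch (IOException e) {
            // Different builds are rejected before the layout assert is ever evaluated.
            System.out.println(e.getMessage());
        }
    }
}
```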



[jira] Commented: (HADOOP-3448) Add some more hints of the problem when datanode and namenode don't match

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600182#action_12600182 ] 

Steve Loughran commented on HADOOP-3448:
----------------------------------------

The patch I've submitted adds the version information, so the stack trace can be more useful than:

java.lang.AssertionError: Data-node and name-node layout versions must be the same.,

However, one thing to consider is whether this should be checked every time the datanode starts up, rather than skipping it if -ea is disabled. Otherwise there is a risk that the problem will not be picked up in production. 
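
As a rough sketch of what an always-on check might look like (an assumption for discussion, not what the attached patch does): the class and method names below are hypothetical, and the version numbers are made up. An explicit comparison that throws runs whether or not -ea is set, unlike an assert.

```java
import java.io.IOException;

// Sketch only: a hypothetical helper, not part of the attached patch or of Hadoop.
class LayoutVersionCheck {
    static void verifyLayoutVersion(int expected, int actual) throws IOException {
        // A plain if/throw is evaluated on every run, with or without -ea.
        if (expected != actual) {
            throw new IOException("Data-node and name-node layout versions must be the same."
                + " Expected: " + expected + " actual " + actual);
        }
    }

    public static void main(String[] args) {
        try {
            verifyLayoutVersion(-16, -17);
        } catch (IOException e) {
            // Prints: Data-node and name-node layout versions must be the same. Expected: -16 actual -17
            System.out.println(e.getMessage());
        }
    }
}
```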

