Posted to common-issues@hadoop.apache.org by "Eli Collins (Created) (JIRA)" <ji...@apache.org> on 2012/02/09 03:07:59 UTC

[jira] [Created] (HADOOP-8041) HA: log a warning when a failover is first attempted

HA: log a warning when a failover is first attempted 
-----------------------------------------------------

                 Key: HADOOP-8041
                 URL: https://issues.apache.org/jira/browse/HADOOP-8041
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: ha
    Affects Versions: HA Branch (HDFS-1623)
            Reporter: Eli Collins


Currently we always warn for each client operation made to a NN we've failed over to:

{noformat}
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfs -lsr /
12/02/08 17:43:04 WARN retry.RetryInvocationHandler: Exception while invoking getFileInfo of class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
{noformat}

I'm going to remove this warning in HDFS-2918 since we shouldn't warn every time a client performs an operation; e.g., the operation could come weeks after the failover. But we should log a warning when the client first does a failover.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8041) HA: log a warning when a failover is first attempted

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205421#comment-13205421 ] 

Hudson commented on HADOOP-8041:
--------------------------------

Integrated in Hadoop-Hdfs-HAbranch-build #74 (See [https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/74/])
    HADOOP-8041. Log a warning when a failover is first attempted. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242441
Files : 
* /hadoop/common/branches/HDFS-1623/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt
* /hadoop/common/branches/HDFS-1623/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java

                

        

[jira] [Commented] (HADOOP-8041) HA: log a warning when a failover is first attempted

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204250#comment-13204250 ] 

Suresh Srinivas commented on HADOOP-8041:
-----------------------------------------

I am not sure I understand this. For a long-running client, once failover has occurred, the client knows which NN to talk to. In this case I am not sure why a warning is logged in the first place! For a short-lived client, say the CLI, I am not sure how you can avoid this.
                

        

[jira] [Updated] (HADOOP-8041) HA: log a warning when a failover is first attempted

Posted by "Eli Collins (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HADOOP-8041:
--------------------------------

    Description: 
Currently we always warn for each client operation made to a NN we've failed over to:

{noformat}
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfs -lsr /
12/02/08 17:43:04 WARN retry.RetryInvocationHandler: Exception while invoking getFileInfo of class
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
{noformat}

I'm going to remove this warning in HDFS-2922 since we shouldn't warn every time a client performs an operation; e.g., the operation could come weeks after the failover. But we should log a warning when the client first does a failover so that it shows up, e.g., in the MR and HBase logs.

  was:
Currently we always warn for each client operation made to a NN we've failed over to:

{noformat}
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfs -lsr /
12/02/08 17:43:04 WARN retry.RetryInvocationHandler: Exception while invoking getFileInfo of class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
{noformat}

I'm going to remove this warning in HDFS-2918 since we shouldn't warn every time a client performs an operation; e.g., the operation could come weeks after the failover. But we should log a warning when the client first does a failover.

    

        

[jira] [Commented] (HADOOP-8041) HA: log a warning when a failover is first attempted

Posted by "Aaron T. Myers (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204697#comment-13204697 ] 

Aaron T. Myers commented on HADOOP-8041:
----------------------------------------

I agree with Todd's analysis. If a given failover proxy has never successfully made a call to the first NN it tries, there is no reason to print any info. If, after having been successfully connected to an NN, we then perform a client failover, that seems like info worth sharing with the user.

+1, the patch looks good to me.
                

        

[jira] [Commented] (HADOOP-8041) HA: log a warning when a failover is first attempted

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204274#comment-13204274 ] 

Todd Lipcon commented on HADOOP-8041:
-------------------------------------

My thinking is the following:
For a given proxy object, when we first have a successful RPC, we can set an internal flag in the failover proxy provider indicating that it has connected once. Then, whenever we do a failover, we print a warning if that flag is set; otherwise, we print it only at DEBUG level.

This would allow FsShell commands to not print warnings due to an "already been failed over for a while" situation, but still cause an INFO message to be printed in MR tasks or HBase servers if a failover takes place while they're accessing DFS.
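
A minimal Java sketch of the flag idea described above, for illustration only. It is not the attached patch: the class FailoverLogGate, its Level enum, and its method names are hypothetical, and the real change lives in RetryInvocationHandler. The sketch only shows the decision: a proxy that has never completed an RPC logs a failover quietly, while one that has already worked logs it as a warning.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical helper (not the class from the attached patch): decides how
 * loudly a failover should be logged for one proxy object.
 */
class FailoverLogGate {
    enum Level { WARN, DEBUG }

    // Set once this proxy has completed at least one RPC successfully.
    private final AtomicBoolean hasConnectedOnce = new AtomicBoolean(false);

    /** Call after any RPC on this proxy returns without error. */
    void recordSuccessfulCall() {
        hasConnectedOnce.set(true);
    }

    /** Level to use when this proxy is about to fail over. */
    Level levelForFailover() {
        return hasConnectedOnce.get() ? Level.WARN : Level.DEBUG;
    }

    public static void main(String[] args) {
        // Short-lived client (e.g. a shell command): the first NN it tries is standby.
        FailoverLogGate shell = new FailoverLogGate();
        System.out.println("shell client failover -> " + shell.levelForFailover());

        // Long-running client: it worked for a while, then a failover happened.
        FailoverLogGate daemon = new FailoverLogGate();
        daemon.recordSuccessfulCall();
        System.out.println("long-running client failover -> " + daemon.levelForFailover());
    }
}
```

With this shape the flag is per proxy object, so a fresh command-line invocation starts with the flag unset each time, while a long-lived MR task or HBase server keeps it set after its first successful call.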
                

        

[jira] [Updated] (HADOOP-8041) HA: log a warning when a failover is first attempted

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-8041:
--------------------------------

    Attachment: hadoop-8041.txt

Attached patch fixes the issue.
I didn't edit unit tests since this is just a logging change, but I verified on a cluster that when the second NN is active, FsShell commands no longer print warnings. If I put both NNs in standby mode, I get warnings as it flip-flops back and forth trying to find an active one.
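
The two observations above (no warning when the second NN is active, repeated warnings when both NNs are standby) would both follow from a rule that stays quiet only for the first failover of a proxy that has never made a successful call. The sketch below assumes that rule; the thread does not state it, and the class FailoverLogPolicy and its parameters are hypothetical names rather than code from the patch.

```java
/**
 * Hypothetical policy (an assumption, not taken from the patch): suppress the
 * warning only for the first failover of a proxy that has never succeeded.
 */
class FailoverLogPolicy {
    /**
     * @param failoversSoFar        failovers already attempted for the current invocation
     * @param hasMadeSuccessfulCall whether this proxy has ever completed an RPC
     * @return true if the failover should be logged as a warning
     */
    static boolean shouldWarn(int failoversSoFar, boolean hasMadeSuccessfulCall) {
        boolean firstFailoverOfFreshProxy = failoversSoFar == 0 && !hasMadeSuccessfulCall;
        return !firstFailoverOfFreshProxy;
    }

    public static void main(String[] args) {
        // Fresh shell command, second NN is active: one quiet failover, then success.
        System.out.println("fresh proxy, first failover: warn=" + shouldWarn(0, false));

        // Fresh shell command, both NNs standby: later failovers are reported.
        System.out.println("fresh proxy, second failover: warn=" + shouldWarn(1, false));

        // Long-running client that had been working: reported from the first failover.
        System.out.println("working proxy, first failover: warn=" + shouldWarn(0, true));
    }
}
```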
                

        

[jira] [Assigned] (HADOOP-8041) HA: log a warning when a failover is first attempted

Posted by "Todd Lipcon (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon reassigned HADOOP-8041:
-----------------------------------

    Assignee: Todd Lipcon
    

        

[jira] [Resolved] (HADOOP-8041) HA: log a warning when a failover is first attempted

Posted by "Todd Lipcon (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved HADOOP-8041.
---------------------------------

       Resolution: Fixed
    Fix Version/s: HA Branch (HDFS-1623)
     Hadoop Flags: Reviewed

Committed to branch, thanks.
                