Posted to common-issues@hadoop.apache.org by "Eli Collins (Created) (JIRA)" <ji...@apache.org> on 2012/02/09 03:07:59 UTC
[jira] [Created] (HADOOP-8041) HA: log a warning when a failover is first attempted
HA: log a warning when a failover is first attempted
-----------------------------------------------------
Key: HADOOP-8041
URL: https://issues.apache.org/jira/browse/HADOOP-8041
Project: Hadoop Common
Issue Type: Sub-task
Components: ha
Affects Versions: HA Branch (HDFS-1623)
Reporter: Eli Collins
Currently we always warn for each client operation made to a NN we've failed over to:
{noformat}
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfs -lsr /
12/02/08 17:43:04 WARN retry.RetryInvocationHandler: Exception while invoking getFileInfo of class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
{noformat}
I'm going to remove this warning in HDFS-2918 since we shouldn't warn every time a client performs an operation, e.g. it could be weeks after the failover. But we should log a warning when the client first does a failover.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8041) HA: log a warning when a failover is first attempted
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205421#comment-13205421 ]
Hudson commented on HADOOP-8041:
--------------------------------
Integrated in Hadoop-Hdfs-HAbranch-build #74 (See [https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/74/])
HADOOP-8041. Log a warning when a failover is first attempted. Contributed by Todd Lipcon.
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242441
Files :
* /hadoop/common/branches/HDFS-1623/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt
* /hadoop/common/branches/HDFS-1623/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java
> HA: log a warning when a failover is first attempted
> -----------------------------------------------------
>
> Key: HADOOP-8041
> URL: https://issues.apache.org/jira/browse/HADOOP-8041
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: ha
> Affects Versions: HA Branch (HDFS-1623)
> Reporter: Eli Collins
> Assignee: Todd Lipcon
> Fix For: HA Branch (HDFS-1623)
>
> Attachments: hadoop-8041.txt
>
>
> Currently we always warn for each client operation made to a NN we've failed over to:
> {noformat}
> hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfs -lsr /
> 12/02/08 17:43:04 WARN retry.RetryInvocationHandler: Exception while invoking getFileInfo of class
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
> {noformat}
> I'm going to remove this warning in HDFS-2922 since we shouldn't warn every time a client performs an operation, e.g. it could be weeks after the failover. But we should log a warning when the client first does a failover so that it shows up, e.g., in the MR and HBase logs.
[jira] [Commented] (HADOOP-8041) HA: log a warning when a failover is first attempted
Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204250#comment-13204250 ]
Suresh Srinivas commented on HADOOP-8041:
-----------------------------------------
I am not sure I understand this. For a long-running client, once failover has occurred, the client knows which NN to talk to. In this case I am not sure why a warning is logged in the first place! For a short-lived client, say the CLI, I am not sure how you can avoid this.
[jira] [Updated] (HADOOP-8041) HA: log a warning when a failover is first attempted
Posted by "Eli Collins (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eli Collins updated HADOOP-8041:
--------------------------------
Description:
Currently we always warn for each client operation made to a NN we've failed over to:
{noformat}
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfs -lsr /
12/02/08 17:43:04 WARN retry.RetryInvocationHandler: Exception while invoking getFileInfo of class
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
{noformat}
I'm going to remove this warning in HDFS-2922 since we shouldn't warn every time a client performs an operation, e.g. it could be weeks after the failover. But we should log a warning when the client first does a failover so that it shows up, e.g., in the MR and HBase logs.
was:
Currently we always warn for each client operation made to a NN we've failed over to:
{noformat}
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfs -lsr /
12/02/08 17:43:04 WARN retry.RetryInvocationHandler: Exception while invoking getFileInfo of class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
{noformat}
I'm going to remove this warning in HDFS-2918 since we shouldn't warn every time a client performs an operation, e.g. it could be weeks after the failover. But we should log a warning when the client first does a failover.
[jira] [Commented] (HADOOP-8041) HA: log a warning when a failover is first attempted
Posted by "Aaron T. Myers (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204697#comment-13204697 ]
Aaron T. Myers commented on HADOOP-8041:
----------------------------------------
I agree with Todd's analysis. If a given failover proxy has never successfully made a call to the first NN it tries, there is no reason to print any info. If, after having successfully connected to an NN, we then perform a client failover, that seems like info worth sharing with the user.
+1, the patch looks good to me.
[jira] [Commented] (HADOOP-8041) HA: log a warning when a failover is first attempted
Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204274#comment-13204274 ]
Todd Lipcon commented on HADOOP-8041:
-------------------------------------
My thinking is the following:
For a given proxy object, when we first have a successful RPC, we can set an internal flag in the failover proxy provider indicating that it has connected once. Then, whenever we do a failover, if that flag is set, then we should print a warning. Otherwise, only print it at DEBUG level.
This would keep FsShell commands from printing warnings in an "already been failed over for a while" situation, but still cause an INFO message to be printed in MR tasks or HBase servers if a failover takes place while they're accessing DFS.
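The flag-based idea described above might be sketched roughly as follows. This is a hypothetical illustration, not the committed patch: the class name FailoverLogSketch, the field hasMadeASuccessfulCall, and the methods onCallSucceeded and onFailover are all invented for the example, and java.util.logging (with WARNING and FINE standing in for WARN and DEBUG) is used only to keep the sketch self-contained.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of per-proxy failover logging; not the real RetryInvocationHandler. */
class FailoverLogSketch {
    private static final Logger LOG =
        Logger.getLogger(FailoverLogSketch.class.getName());

    // Assumption: one flag per proxy object, set once any RPC has succeeded.
    private final AtomicBoolean hasMadeASuccessfulCall = new AtomicBoolean(false);

    /** Call after an RPC through this proxy completes without throwing. */
    void onCallSucceeded() {
        hasMadeASuccessfulCall.set(true);
    }

    /**
     * Logs a failover attempt and returns the level that was chosen:
     * WARNING if this proxy has worked before, FINE (debug) otherwise.
     */
    Level onFailover(String method, int failovers) {
        Level level = hasMadeASuccessfulCall.get() ? Level.WARNING : Level.FINE;
        LOG.log(level,
            "Exception while invoking {0} after {1} fail over attempts. Trying to fail over.",
            new Object[] {method, failovers});
        return level;
    }

    public static void main(String[] args) {
        FailoverLogSketch proxy = new FailoverLogSketch();
        // A fresh proxy (e.g. a short-lived shell command) fails over quietly.
        System.out.println(proxy.onFailover("getFileInfo", 0));
        // After one successful call, a later failover is worth a warning.
        proxy.onCallSucceeded();
        System.out.println(proxy.onFailover("getFileInfo", 0));
    }
}
```

With the default java.util.logging configuration, FINE messages are not shown on the console, so only the second failover would produce visible log output, which mirrors the intended behavior for short-lived versus long-running clients.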
[jira] [Updated] (HADOOP-8041) HA: log a warning when a failover is first attempted
Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated HADOOP-8041:
--------------------------------
Attachment: hadoop-8041.txt
Attached patch fixes the issue.
I didn't edit unit tests since this is just a logging change, but I verified on a cluster that when the second NN is active, FsShell commands no longer print warnings. If I put both NNs in standby mode, I get warnings as it flip-flops back and forth trying to find an active one.
[jira] [Assigned] (HADOOP-8041) HA: log a warning when a failover is first attempted
Posted by "Todd Lipcon (Assigned) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon reassigned HADOOP-8041:
-----------------------------------
Assignee: Todd Lipcon
[jira] [Resolved] (HADOOP-8041) HA: log a warning when a failover is first attempted
Posted by "Todd Lipcon (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon resolved HADOOP-8041.
---------------------------------
Resolution: Fixed
Fix Version/s: HA Branch (HDFS-1623)
Hadoop Flags: Reviewed
Committed to branch, thanks.