Posted to common-issues@hadoop.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2014/11/12 20:51:34 UTC

[jira] [Commented] (HADOOP-7397) Allow configurable timeouts when connecting to HDFS via java FileSystem API

    [ https://issues.apache.org/jira/browse/HADOOP-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208554#comment-14208554 ] 

Hadoop QA commented on HADOOP-7397:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546174/HADOOP-7397.patch
  against trunk revision 782abbb.

    {color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5069//console

This message is automatically generated.

> Allow configurable timeouts when connecting to HDFS via java FileSystem API
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-7397
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7397
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.23.0
>            Reporter: Scott Fines
>            Priority: Minor
>              Labels: hadoop
>         Attachments: HADOOP-7397.patch, timeout.patch
>
>
> If the NameNode is not available (for example, during a network partition separating the client from the NameNode) and an attempt is made to connect, the FileSystem API will *eventually* time out and throw an error. However, that timeout is currently hardcoded to 20 seconds per connection attempt, with 45 retries, for a total wait of 15 minutes before failure. While this is fine in many circumstances, there are also many (such as booting a service) where both the connection timeout and the number of retries should be significantly lower, so as not to harm the availability of other services.
> Investigating Client.java, I see that there are two fields in Connection: maxRetries and rpcTimeout. I propose either re-using those fields for initiating the connection as well, or using the existing dfs.socket.timeout parameter to set the connection timeout on initialization and potentially adding a new parameter, such as dfs.connection.retries, with a default of 45 to replicate the current behavior.
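
The 15 minute figure in the description follows from simple arithmetic, which the sketch below reproduces. It is an illustration only, not Hadoop code: the class ConnectWaitEstimate and its method worstCaseWait are hypothetical names, and the model assumes 45 connection attempts in total at 20 seconds each (as the description's arithmetic implies), ignoring any pause between attempts.

```java
import java.time.Duration;

// Hypothetical helper, not part of Hadoop: estimates how long a client
// could block when every connection attempt runs into its timeout.
final class ConnectWaitEstimate {

    // Assumption: each attempt waits the full connect timeout and no extra
    // time is spent between attempts.
    static Duration worstCaseWait(Duration connectTimeout, int attempts) {
        if (attempts < 0) {
            throw new IllegalArgumentException("attempts must be non-negative");
        }
        return connectTimeout.multipliedBy(attempts);
    }

    public static void main(String[] args) {
        // Hardcoded values quoted in the issue: 20 seconds, 45 attempts.
        Duration current = worstCaseWait(Duration.ofSeconds(20), 45);
        System.out.println("Current worst case: " + current.toMinutes() + " min");

        // Illustrative lower values for a service that must fail fast.
        Duration tuned = worstCaseWait(Duration.ofSeconds(5), 3);
        System.out.println("Tuned worst case: " + tuned.getSeconds() + " s");
    }
}
```

With the values from the description the estimate is 15 minutes; the second call uses made-up values (5 seconds, 3 attempts) purely to show how much shorter the wait becomes once both settings are configurable.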



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)