Posted to common-dev@hadoop.apache.org by "Dhiraj Hegde (Jira)" <ji...@apache.org> on 2020/05/26 06:22:00 UTC

[jira] [Created] (HADOOP-17052) NetUtils.connect() throws an exception that prevents any retries when hostname resolution fails

Dhiraj Hegde created HADOOP-17052:
-------------------------------------

             Summary: NetUtils.connect() throws an exception that prevents any retries when hostname resolution fails
                 Key: HADOOP-17052
                 URL: https://issues.apache.org/jira/browse/HADOOP-17052
             Project: Hadoop Common
          Issue Type: Bug
          Components: hdfs-client
    Affects Versions: 3.1.3, 3.2.1, 2.9.2, 2.10.0
            Reporter: Dhiraj Hegde


Hadoop components are increasingly deployed on VMs and containers, where DNS is dynamic. Hostname records are modified (or deleted and recreated) whenever a container in Kubernetes (or even a VM) is created or recreated. In such environments, the initial DNS resolution request may briefly fail because the DNS client does not always have the latest records. This has been observed in Kubernetes in particular. In such cases NetUtils.connect() appears to throw java.nio.channels.UnresolvedAddressException. Much of the Hadoop code (such as DFSInputStream and DFSOutputStream) is designed to retry on IOException. However, UnresolvedAddressException is not a subclass of IOException (it extends IllegalArgumentException), so no retry happens and the code aborts immediately. It would be much better for NetUtils.connect() to throw java.net.UnknownHostException, which is derived from IOException, so that callers treat the failure as a retry-able error.
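
The proposed behaviour can be illustrated with a minimal sketch. It is not the actual Hadoop patch: the class ResolvingConnect and its connect() helper are hypothetical names introduced here, and the host name and port are placeholders. The sketch assumes only the standard JDK behaviour that SocketChannel.connect() throws UnresolvedAddressException for an unresolved address, and translates that into UnknownHostException so that callers catching IOException can retry.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.net.UnknownHostException;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

public class ResolvingConnect {

    // Hypothetical helper (not Hadoop's NetUtils API): connects the channel and
    // converts the unchecked UnresolvedAddressException into UnknownHostException,
    // which is an IOException and therefore visible to IOException retry loops.
    public static void connect(SocketChannel channel, SocketAddress endpoint)
            throws IOException {
        try {
            channel.connect(endpoint);
        } catch (UnresolvedAddressException e) {
            UnknownHostException uhe =
                    new UnknownHostException("Cannot resolve " + endpoint);
            uhe.initCause(e); // keep the original exception for diagnostics
            throw uhe;
        }
    }

    public static void main(String[] args) throws Exception {
        // createUnresolved() performs no DNS lookup, so this address is
        // guaranteed to be unresolved; host and port are placeholders.
        InetSocketAddress unresolved =
                InetSocketAddress.createUnresolved("datanode.invalid", 9866);
        try (SocketChannel channel = SocketChannel.open()) {
            connect(channel, unresolved);
        } catch (IOException e) {
            // A caller that retries on IOException now reaches this branch.
            System.out.println("retry-able: " + e.getClass().getName());
        }
    }
}
```

With this translation in place, a caller's existing catch (IOException) block sees the resolution failure and can apply its usual retry policy instead of aborting on an unchecked exception.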


