Posted to hdfs-dev@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2014/07/31 01:51:39 UTC

[jira] [Resolved] (HDFS-1384) NameNode should give client the first node in the pipeline from different rack other than that of excludedNodes list in the same rack.

     [ https://issues.apache.org/jira/browse/HDFS-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HDFS-1384.
------------------------------------

    Resolution: Incomplete

Closing as stale.

> NameNode should give client the first node in the pipeline from different rack  other than that of excludedNodes list in the same rack.
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1384
>                 URL: https://issues.apache.org/jira/browse/HDFS-1384
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.20-append
>            Reporter: Thanh Do
>
> We saw a case where the NN keeps giving the client nodes from the same rack, which causes
> an exception on the client when it tries to set up the pipeline. The client retries 5 times and then fails.
>  
> Here are more details. Suppose we have 2 racks:
> - Rack 0: from dn1 to dn7
> - Rack 1: from dn8 to dn14
> The client asks for 3 DNs and the NN replies with dn1, dn8 and dn9, for example.
> Because there is a network partition, the client cannot reach any node in Rack 0.
> Hence, the client adds dn1 to the excludedNodes list and asks the NN again.
> Interestingly, the NN picks a different node in Rack 0 (one that is not in excludedNodes)
> and gives it back to the client, and so on. The client keeps retrying, and after 5 retries
> the write fails.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and 
> Haryadi Gunawi (haryadi@eecs.berkeley.edu)
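
The scenario in the quoted report can be illustrated with a small sketch. This is not the HDFS block placement code and not a fix that was ever committed (the issue was closed as stale); the class RackAwareChooserSketch and all of its methods are hypothetical names invented for illustration. The sketch also assumes a deterministic, in-order node choice, whereas the real NameNode selection is randomized. It contrasts a chooser that skips only the excluded nodes with one that also avoids the racks those nodes belong to.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; names and logic are assumptions, not HDFS internals.
final class RackAwareChooserSketch {
    // Datanode name -> rack name, kept in insertion order so choices are deterministic.
    private final Map<String, String> rackOf = new LinkedHashMap<>();

    void addNode(String node, String rack) {
        rackOf.put(node, rack);
    }

    // Behaviour described in the report: only the excluded nodes themselves are skipped,
    // so another node from the same unreachable rack can be returned.
    String chooseFirstNaive(Set<String> excludedNodes) {
        for (String node : rackOf.keySet()) {
            if (!excludedNodes.contains(node)) {
                return node;
            }
        }
        return null; // no candidate left
    }

    // Behaviour the issue title asks for: prefer a node whose rack holds no excluded node,
    // and fall back to the naive choice if every rack is affected.
    String chooseFirstRackAware(Set<String> excludedNodes) {
        Set<String> suspectRacks = new HashSet<>();
        for (String node : excludedNodes) {
            String rack = rackOf.get(node);
            if (rack != null) {
                suspectRacks.add(rack);
            }
        }
        for (Map.Entry<String, String> e : rackOf.entrySet()) {
            if (!excludedNodes.contains(e.getKey()) && !suspectRacks.contains(e.getValue())) {
                return e.getKey();
            }
        }
        return chooseFirstNaive(excludedNodes);
    }

    // Topology from the report: Rack 0 holds dn1..dn7, Rack 1 holds dn8..dn14.
    static RackAwareChooserSketch twoRackExample() {
        RackAwareChooserSketch chooser = new RackAwareChooserSketch();
        for (int i = 1; i <= 14; i++) {
            chooser.addNode("dn" + i, i <= 7 ? "rack0" : "rack1");
        }
        return chooser;
    }
}
```

With dn1 in the excluded set, chooseFirstNaive on the two-rack example returns another Rack 0 node, which mirrors the retry loop in the report, while chooseFirstRackAware returns a Rack 1 node. The fallback to the naive choice is a design assumption so that a write can still proceed when every rack contains an excluded node.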



--
This message was sent by Atlassian JIRA
(v6.2#6252)