Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2008/07/03 20:31:45 UTC

[jira] Issue Comment Edited: (HADOOP-3685) Unbalanced replication target

    [ https://issues.apache.org/jira/browse/HADOOP-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610295#action_12610295 ] 

hairong edited comment on HADOOP-3685 at 7/3/08 11:30 AM:
----------------------------------------------------------------

This patch places the third replica on the rack where the source is located when re-replicating a block whose two existing replicas sit on two different racks. Since the source and the target are on the same rack, only the datanodes on that rack may choose to replicate an under-replicated block to this target. Therefore at most 2 * (rack size - 1) block transfers may happen to a single target within a heartbeat interval.

      was (Author: hairong):
    This patch places a third replica on the rack where the source is located in case of rereplication when two existing replicas are on two different racks.
  
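The placement rule described in the comment above can be sketched as follows. This is an illustrative simplification, not the actual Hadoop API: the names (Node, chooseThirdReplicaRack) and the "remote-rack" fallback are hypothetical, and the real chooser works over a cluster topology map.

```java
// Hedged sketch of the patched re-replication rule: if the two existing
// replicas live on two different racks, place the third replica on the
// source datanode's rack, keeping the transfer rack-local.
public class RereplicationSketch {
    record Node(String name, String rack) {}

    // Returns the rack on which the third replica should be placed.
    static String chooseThirdReplicaRack(Node source, Node r1, Node r2) {
        if (!r1.rack().equals(r2.rack())) {
            // Two racks already hold the block; a rack-local target
            // limits who can send to it (only same-rack datanodes).
            return source.rack();
        }
        // Both replicas share a rack: a remote rack is needed for
        // rack fault tolerance (placeholder choice here).
        return "remote-rack";
    }

    public static void main(String[] args) {
        Node source = new Node("dn1", "rackA");
        Node r1 = new Node("dn2", "rackA");
        Node r2 = new Node("dn3", "rackB");
        System.out.println(chooseThirdReplicaRack(source, r1, r2)); // rackA
    }
}
```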
> Unbalanced replication target 
> ------------------------------
>
>                 Key: HADOOP-3685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3685
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.17.0
>            Reporter: Koji Noguchi
>            Assignee: Hairong Kuang
>            Priority: Critical
>         Attachments: rereplicationPolicy.patch
>
>
> In HADOOP-3633, the namenode was assigning some datanodes to receive hundreds of blocks in a short period, which caused the datanodes to run out of memory (threads).
> Most of these transfers came from remote racks.
> Looking at the code, 
> {noformat}
>     166           chooseLocalRack(results.get(1), excludedNodes, blocksize,
>     167                           maxNodesPerRack, results);
> {noformat}
> was sometimes not choosing the local rack of the writer (source).
> As a result, when a datanode went down, other datanodes on the same rack received a large number of blocks from remote racks.
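The transfer bound claimed in the edited comment can be worked through numerically. The factor of two is taken directly from the comment's "at most twice of (rack size - 1)" wording; the interpretation that it reflects a per-datanode limit of two concurrent replication streams is an assumption here, and the method name is illustrative.

```java
// Back-of-the-envelope bound from the comment: with rack-local targets,
// only the (rackSize - 1) same-rack peers can send to a given target,
// and the comment's factor of two (assumed here to be a per-datanode
// replication-stream limit) caps transfers per heartbeat interval at
// 2 * (rackSize - 1).
public class TransferBoundSketch {
    static int maxTransfersPerHeartbeat(int rackSize) {
        return 2 * (rackSize - 1);
    }

    public static void main(String[] args) {
        // e.g. a 40-node rack bounds a single target at 78 incoming transfers,
        // instead of potentially receiving from the whole cluster.
        System.out.println(maxTransfersPerHeartbeat(40)); // 78
    }
}
```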

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.