Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2007/05/31 20:31:15 UTC

[jira] Created: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Setting the replication factor of a file too high causes namenode cpu overload
------------------------------------------------------------------------------

                 Key: HADOOP-1448
                 URL: https://issues.apache.org/jira/browse/HADOOP-1448
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
            Reporter: dhruba borthakur


The replication factor of a file is set to 300 (on an 800-node cluster). Then all mappers try to open this file. For every open call that the namenode receives from each of these 800 clients, it sorts all the replicas of the block(s) based on their distance from the client. This causes CPU overload on the namenode.

One proposal is to make the namenode return an unsorted list of datanodes to the client. The information about each replica also includes the rack on which that replica resides. The client can look at the replicas to determine whether there is a copy on the local node. If not, it can find out whether there is a replica on the local rack. If not, it can choose a replica at random.

This proposal is scalable because the sorting and selection of replicas are done by the client rather than the namenode.
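
A minimal sketch of the client-side selection this describes (illustrative only; the Replica holder and its field names below are hypothetical, not the actual DFSClient API):

    import java.util.List;
    import java.util.Random;

    // Hypothetical holder for the (host, rack) information the namenode
    // would return for each replica under this proposal.
    class Replica { String host; String rack; }

    class ReplicaChooser {
      private static final Random RAND = new Random();

      // Prefer a copy on the local node, then one on the local rack,
      // then fall back to a random replica.
      static Replica choose(List<Replica> replicas,
                            String localHost, String localRack) {
        Replica onRack = null;
        for (Replica r : replicas) {
          if (r.host.equals(localHost)) {
            return r;                          // local copy wins outright
          }
          if (onRack == null && r.rack.equals(localRack)) {
            onRack = r;                        // remember a local-rack copy
          }
        }
        if (onRack != null) {
          return onRack;
        }
        return replicas.get(RAND.nextInt(replicas.size()));
      }
    }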

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510233 ] 

dhruba borthakur commented on HADOOP-1448:
------------------------------------------

I like this algorithm. It will address the case where a non-dfs client is accessing a file: the first location such a client gets will then always be a random replica.



[jira] Updated: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-1448:
----------------------------------

    Attachment: getBlockLocation.patch

This patch is a little simpler than what I have proposed.
1. The returned list contains all the replica locations.
2. In the returned location list, a local replica is followed by a local-rack replica, which is then followed by the remaining replicas.
3. If there are no local or local-rack replicas, the first entry in the list is a random replica.
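
Roughly, that ordering amounts to something like this (an illustrative sketch reusing the hypothetical Replica holder from the first sketch above, not the patch itself):

    import java.util.Random;

    class PseudoSortSketch {
      // Swap a local replica into slot 0 and a local-rack replica into
      // slot 1; if neither category exists, put a random replica first.
      static void pseudoSort(Replica[] locs, String localHost,
                             String localRack, Random rand) {
        int next = 0;                    // next front slot to fill
        for (int i = 0; i < locs.length; i++) {
          if (locs[i].host.equals(localHost)) {
            swap(locs, next++, i);       // local replica first
            break;
          }
        }
        for (int i = next; i < locs.length; i++) {
          if (locs[i].rack.equals(localRack)) {
            swap(locs, next++, i);       // then a local-rack replica
            break;
          }
        }
        if (next == 0) {                 // neither found: randomize slot 0
          swap(locs, 0, rand.nextInt(locs.length));
        }
      }

      static void swap(Replica[] a, int i, int j) {
        Replica t = a[i]; a[i] = a[j]; a[j] = t;
      }
    }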



[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505457 ] 

Hadoop QA commented on HADOOP-1448:
-----------------------------------

Integrated in Hadoop-Nightly #123 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/123/])



[jira] Updated: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-1448:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Hairong!



[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512056 ] 

Hudson commented on HADOOP-1448:
--------------------------------

Integrated in Hadoop-Nightly #152 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/152/])



[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505932 ] 

Nigel Daley commented on HADOOP-1448:
-------------------------------------

The comment from Hadoop QA that this was integrated in build #123 is erroneous. I believe the changelog indicated this JIRA, but it should have listed HADOOP-1449 instead.



[jira] Updated: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-1448:
----------------------------------

        Fix Version/s: 0.14.0
    Affects Version/s: 0.13.0
               Status: Patch Available  (was: Open)

I am making the patch available and expecting review comments.



[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508684 ] 

dhruba borthakur commented on HADOOP-1448:
------------------------------------------

This problem becomes worse when there are map-reduce nodes that do not run "dfs". *All* accesses from these nodes go to the first replica of every block. 

One proposal is that if the client does not belong to a dfs cluster, then the getBlockLocations call returns all replicas in a somewhat *random* order.
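
On the namenode side that could look like this (a hypothetical sketch; shuffling is just one way to randomize the order):

    import java.util.Arrays;
    import java.util.Collections;

    class NonDfsClientOrdering {
      // Hypothetical: if the caller is not part of the dfs cluster there
      // is no locality to exploit, so return the replicas in random order
      // and spread the read load instead of always hitting the first one.
      static Replica[] orderForClient(Replica[] locs, boolean inCluster) {
        if (inCluster) {
          return locs;                   // keep the locality-aware order
        }
        Replica[] copy = locs.clone();
        Collections.shuffle(Arrays.asList(copy));
        return copy;
      }
    }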



[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511298 ] 

dhruba borthakur commented on HADOOP-1448:
------------------------------------------

+1. Code looks good. One minor comment: can we enhance the unit test to cover the case where a "random" node is returned when neither a local-node nor a local-rack replica is available? Maybe we can invoke pseudoSortByDistance multiple times and verify that at least some invocations return different nodes as the first entry in the list.
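
For instance, something along these lines (a hypothetical JUnit-style sketch reusing the PseudoSortSketch and Replica types from the sketches above; the real pseudoSortByDistance signature and test fixtures may differ):

    import java.util.HashSet;
    import java.util.Random;
    import java.util.Set;
    import junit.framework.TestCase;

    public class TestRandomFirstReplica extends TestCase {
      public void testRandomFirstReplica() {
        Set<String> firstHosts = new HashSet<String>();
        for (int i = 0; i < 100; i++) {
          Replica[] locs = makeOffRackReplicas();
          PseudoSortSketch.pseudoSort(locs, "clientHost", "clientRack",
                                      new Random());
          firstHosts.add(locs[0].host);
        }
        // At least some invocations should pick a different first node.
        assertTrue(firstHosts.size() > 1);
      }

      // Hypothetical fixture: a location list with no local-node or
      // local-rack copies relative to clientHost/clientRack.
      private Replica[] makeOffRackReplicas() {
        Replica[] locs = new Replica[5];
        for (int i = 0; i < locs.length; i++) {
          locs[i] = new Replica();
          locs[i].host = "host" + i;
          locs[i].rack = "rack" + i;
        }
        return locs;
      }
    }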



[jira] Assigned: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang reassigned HADOOP-1448:
-------------------------------------

    Assignee: Hairong Kuang



[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510802 ] 

Hadoop QA commented on HADOOP-1448:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12361243/getBlockLocation.patch applied and successfully tested against trunk revision r553623.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/369/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/369/console



[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500458 ] 

Raghu Angadi commented on HADOOP-1448:
--------------------------------------

Does this test include the patch from HADOOP-1155? Maybe this would be a good test to show the performance benefits of 1155; it is currently not committed because of a lack of proof. The pseudo-sort in HADOOP-1155 can easily be extended to find only the 3 'best' nodes (if it is not already doing so). That does not even require a full linear scan of all 300 replicas.
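
As an illustration of that early-exit idea (a sketch only, not HADOOP-1155's code): once a local replica and a local-rack replica have been found, every remaining replica is an equally good off-rack candidate, so the scan can stop.

    class BestThreeSketch {
      // Illustrative: pick at most one replica per distance category and
      // stop scanning as soon as all three slots are filled. Entries may
      // be null if a category has no replica.
      static Replica[] bestThree(Replica[] locs, String host, String rack) {
        Replica local = null, onRack = null, offRack = null;
        for (Replica r : locs) {
          if (local == null && r.host.equals(host)) {
            local = r;
          } else if (onRack == null && r.rack.equals(rack)) {
            onRack = r;
          } else if (offRack == null) {
            offRack = r;
          }
          if (local != null && onRack != null && offRack != null) {
            break;                       // no need to scan the rest
          }
        }
        return new Replica[] { local, onRack, offRack };
      }
    }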






[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501313 ] 

dhruba borthakur commented on HADOOP-1448:
------------------------------------------

In the short term, I like Doug's proposal #1. The current code already supports a configuration setting named dfs.replication.max. It is set to 512.

For installations that are seeing this problem, maybe they can change dfs.replication.max to a smaller number, say 10-20 or so. An application that could successfully set a replication factor of 512 under the current configuration will then get an exception once dfs.replication.max is lowered to 20.
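
For example, in hadoop-site.xml (the value here is illustrative):

    <property>
      <name>dfs.replication.max</name>
      <value>20</value>
      <description>Maximal block replication; attempts to set a higher
      replication factor on a file will fail with an exception.
      </description>
    </property>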



[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500452 ] 

Doug Cutting commented on HADOOP-1448:
--------------------------------------

Other ideas:

1. Limit the maximum replication to something reasonably small.

2. When a file is highly replicated, sample the replicas rather than sorting and returning the full list. For example, one could first try to return only the first N replicas on the same rack as the client. If that fails, the off-rack replicas could be sampled randomly.

Do we think it is reasonable to set replication to numbers proportional to the cluster size? If so, do we think it is reasonable for namenode requests to take time proportional to the cluster size? If not, then in these cases we should try never to return or process the full list, but rather only a fixed-size subset of it.
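
A sketch of idea 2 (hypothetical; the names and the cutoff are made up): cap the returned list at a fixed size regardless of the replication factor.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;

    class SampledLocations {
      // Illustrative: return at most maxLocs locations, preferring the
      // first-N same-rack replicas and filling the remainder with a
      // random sample of the off-rack ones.
      static List<Replica> sample(List<Replica> all, String clientRack,
                                  int maxLocs, Random rand) {
        List<Replica> picked = new ArrayList<Replica>();
        List<Replica> offRack = new ArrayList<Replica>();
        for (Replica r : all) {
          if (r.rack.equals(clientRack)) {
            if (picked.size() < maxLocs) {
              picked.add(r);             // first-N on-rack replicas
            }
          } else {
            offRack.add(r);
          }
        }
        Collections.shuffle(offRack, rand);
        for (Replica r : offRack) {
          if (picked.size() >= maxLocs) break;
          picked.add(r);                 // random off-rack sample
        }
        return picked;
      }
    }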



[jira] Commented: (HADOOP-1448) Setting the replication factor of a file too high causes namenode cpu overload

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509953 ] 

Hairong Kuang commented on HADOOP-1448:
---------------------------------------

For the getBlockLocation operation, how about the following improvements?
1. For each block, the maximum number of replica locations returned is 3.
2. Instead of sorting, use the pseudo-sort from HADOOP-1155. The returned list is in the order: local replica, local-rack replicas, then off-rack replicas. This algorithm does not require a full linear scan of all replicas.
3. For randomness: if there is a local copy, the first location is the local copy; otherwise, if there are any local-rack copies, set the first location to a random local-rack replica; otherwise, the first location is a random off-rack replica.
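
Point 3 amounts to something like the following (illustrative only): gather the candidates in the best non-empty category, then pick one of them at random to put first.

    import java.util.List;
    import java.util.Random;

    class FirstLocationSketch {
      // Illustrative: choose the first location from the best available
      // category (local node > local rack > off-rack), picking at random
      // within the chosen category for load spreading.
      static Replica firstLocation(List<Replica> local,
                                   List<Replica> localRack,
                                   List<Replica> offRack, Random rand) {
        if (!local.isEmpty()) {
          return local.get(0);           // a local copy wins outright
        }
        List<Replica> pool = localRack.isEmpty() ? offRack : localRack;
        return pool.get(rand.nextInt(pool.size()));
      }
    }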
