Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/06/14 00:09:31 UTC

[jira] Commented: (HADOOP-297) When selecting node to put new block on, give priority to those with more free space/less blocks

    [ http://issues.apache.org/jira/browse/HADOOP-297?page=comments#action_12416089 ] 

Doug Cutting commented on HADOOP-297:
-------------------------------------

This sounds like a reasonable approach to a real problem.  Has anyone tested this on a large cluster?

I worry about performance.  The sort comparator doesn't need to create Long instances, but could instead simply have long variables, no?  Also, the calls to input.remove() make the algorithm quadratic, since each call to Vector.remove() copies all the entries after it.
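
To make the two performance points concrete, here is a minimal sketch. The patch itself is not reproduced in this message, so everything named below (`NodeSortSketch`, `Node`, `score()`, `withoutExcluded`) is hypothetical and is not Hadoop's API. It assumes the patch ranks nodes by free space per block and removes entries from the input list one at a time.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

class NodeSortSketch {
    /** Hypothetical stand-in for a datanode record; not a Hadoop class. */
    static final class Node {
        final String name;
        final long freeSpace;   // bytes still available
        final long blockCount;  // blocks already stored

        Node(String name, long freeSpace, long blockCount) {
            this.name = name;
            this.freeSpace = freeSpace;
            this.blockCount = blockCount;
        }

        /** Free space per block; an empty node counts as one block (assumption). */
        long score() {
            return freeSpace / Math.max(1L, blockCount);
        }
    }

    /** Orders nodes by descending score using primitive longs only. */
    static final Comparator<Node> BY_SCORE_DESC = new Comparator<Node>() {
        @Override
        public int compare(Node a, Node b) {
            long sa = a.score();
            long sb = b.score();
            return Long.compare(sb, sa); // reversed arguments give descending order
        }
    };

    /**
     * Copies the kept nodes into a new list in a single pass, rather than
     * calling remove() on the input once per excluded node.
     */
    static List<Node> withoutExcluded(List<Node> input, Set<String> excluded) {
        List<Node> kept = new ArrayList<Node>(input.size());
        for (Node n : input) {
            if (!excluded.contains(n.name)) {
                kept.add(n);
            }
        }
        return kept;
    }
}
```

The comparator allocates nothing per comparison, and the filter is linear with a hash-based set, whereas repeated removal from an array-backed list shifts the tail on every call.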

> When selecting node to put new block on, give priority to those with more free space/less blocks
> ------------------------------------------------------------------------------------------------
>
>          Key: HADOOP-297
>          URL: http://issues.apache.org/jira/browse/HADOOP-297
>      Project: Hadoop
>         Type: Improvement
>   Components: dfs
>     Versions: 0.3.2
>     Reporter: Johan Oskarson
>     Priority: Minor
>  Attachments: priorityshuffle_v1.patch
>
> As mentioned in previous bug report:
> We're running a smallish cluster with very different machines, some with only 60 GB hard drives.
> This creates a problem when inserting files into the dfs: these machines run out of space quickly while others have plenty of free space.
> So instead of just shuffling the nodes, I've created a quick patch that first sorts the target nodes by (freespace / blocks).
> It then randomizes the position of the first third of the nodes (so we don't put all the blocks in the file on the same machine).
> I'll let you guys figure out how to improve this.
> /Johan
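
The approach described in the report (sort by free space per block, then randomize the first third) might look roughly like the following. This is a sketch under stated assumptions, not the contents of priorityshuffle_v1.patch: `PriorityShuffleSketch`, `Node`, and `prioritize` are hypothetical names, and treating a node with zero blocks as holding one block is an assumption made here to avoid dividing by zero.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

class PriorityShuffleSketch {
    /** Hypothetical stand-in for a datanode record; not a Hadoop class. */
    static final class Node {
        final String name;
        final long freeSpace;   // bytes still available
        final long blockCount;  // blocks already stored

        Node(String name, long freeSpace, long blockCount) {
            this.name = name;
            this.freeSpace = freeSpace;
            this.blockCount = blockCount;
        }

        /** Free space per block; an empty node counts as one block (assumption). */
        long score() {
            return freeSpace / Math.max(1L, blockCount);
        }
    }

    /**
     * Returns a new list sorted by descending score, with the order of the
     * first third randomized. The input list is left unchanged.
     */
    static List<Node> prioritize(List<Node> candidates, Random rnd) {
        List<Node> sorted = new ArrayList<Node>(candidates);
        Collections.sort(sorted, new Comparator<Node>() {
            @Override
            public int compare(Node a, Node b) {
                return Long.compare(b.score(), a.score());
            }
        });
        // Integer division: with fewer than three nodes nothing is shuffled.
        int firstThird = sorted.size() / 3;
        // subList is a view, so shuffling it reorders the head of 'sorted' in place.
        Collections.shuffle(sorted.subList(0, firstThird), rnd);
        return sorted;
    }
}
```

A caller would pass the candidate nodes and a `Random`, then pick targets from the front of the returned list. Shuffling only the head keeps the best-ranked nodes first while spreading consecutive blocks across them.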
