Posted to mapreduce-dev@hadoop.apache.org by "Scott Chen (JIRA)" <ji...@apache.org> on 2010/06/01 21:01:44 UTC

[jira] Created: (MAPREDUCE-1831) Delete the replica on the most concentrated node when raiding file

Delete the replica on the most concentrated node when raiding file
------------------------------------------------------------------

                 Key: MAPREDUCE-1831
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: contrib/raid
    Affects Versions: 0.22.0
            Reporter: Scott Chen
            Assignee: Scott Chen
             Fix For: 0.22.0


In raid, it is good to have the blocks of the same stripe located on different machines.
This way, when one machine goes down, it does not break two blocks of the stripe.
By doing this, we can decrease the block error probability in raid from O(p^3) to O(p^4), which can be a huge improvement.

One way to do this is to add a new BlockPlacementPolicy that deletes the replicas that are co-located.
Then, when raiding the file, we can make the remaining replicas live on different machines.
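
The selection step could be sketched roughly as follows. This is only an illustration, not the actual BlockPlacementPolicy interface: the class StripeReplicaChooser, both method names, and the representation of a stripe as a list of host-name lists are hypothetical. It assumes that raiding lowers each block to some target replication (2 in the example further down), and it greedily deletes, for each block, the replica on the host that currently holds the most replicas of the stripe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Hadoop.
class StripeReplicaChooser {

    /** Counts how many replicas of the stripe's blocks each host holds. */
    static Map<String, Integer> countReplicasPerHost(List<List<String>> stripe) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> replicaHosts : stripe) {
            for (String host : replicaHosts) {
                counts.merge(host, 1, Integer::sum);
            }
        }
        return counts;
    }

    /**
     * For each block in the stripe (given as the list of hosts holding its
     * replicas), picks the replicas to delete until only targetReplication
     * remain. The replica on the most concentrated host is deleted first;
     * ties go to the host listed first.
     */
    static List<List<String>> chooseReplicasToDelete(
            List<List<String>> stripe, int targetReplication) {
        Map<String, Integer> counts = countReplicasPerHost(stripe);
        List<List<String>> deletions = new ArrayList<>();
        for (List<String> replicaHosts : stripe) {
            List<String> remaining = new ArrayList<>(replicaHosts);
            List<String> deleted = new ArrayList<>();
            while (remaining.size() > targetReplication) {
                String worst = remaining.get(0);
                for (String host : remaining) {
                    if (counts.get(host) > counts.get(worst)) {
                        worst = host;
                    }
                }
                remaining.remove(worst);
                // The host now holds one fewer replica of this stripe.
                counts.merge(worst, -1, Integer::sum);
                deleted.add(worst);
            }
            deletions.add(deleted);
        }
        return deletions;
    }
}
```

For example, with a stripe of three blocks whose replicas are on hosts (A, B, C), (A, D, E) and (A, F, G) and a target replication of 2, this sketch deletes the replica on A for every block, so the remaining replicas live on six different machines. A greedy pass like this is not guaranteed to find the best placement in every case; it only shows the idea of preferring the most concentrated node.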

-- 
This message is automatically generated by JIRA.