Posted to common-issues@hadoop.apache.org by "Junping Du (JIRA)" <ji...@apache.org> on 2012/06/04 07:44:23 UTC

[jira] [Created] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Junping Du created HADOOP-8473:
----------------------------------

             Summary: Update Balancer to support new NetworkTopology with NodeGroup
                 Key: HADOOP-8473
                 URL: https://issues.apache.org/jira/browse/HADOOP-8473
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: util
    Affects Versions: 2.0.0-alpha, 1.0.0
            Reporter: Junping Du
            Assignee: Junping Du


Since the Balancer is a Hadoop Tool, it was updated to be directly aware of the four-layer hierarchy instead of creating an alternative Balancer implementation. To accommodate extensibility, a new protected method, doChooseNodesForCustomFaultDomain, is now called from the existing chooseNodes method so that a subclass of the Balancer can customize the balancing algorithm for other failure and locality topologies. An alternative option is to encapsulate the algorithm used for the four-layer hierarchy in a collaborating strategy class.
The key change introduced to support a four-layer hierarchy was to override the algorithm for choosing <source, target> pairs for balancing. Unit tests were created to test the new algorithm.
The algorithm now chooses a source and target node in the same node group as the first priority. The overall balancing policy is therefore: balance first between nodes within the same node group, then within the same rack, and finally off rack. We also need to check that no duplicated replicas live in the same node group after balancing.
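
The policy described above can be illustrated with a minimal Java sketch. It is not the code from the attached patch: the class name, the method names, and the "/rack/nodegroup/host" location strings are all assumptions made for illustration. It orders candidate targets by locality tier (same node group, then same rack, then off rack) and checks that a move would not leave two replicas in one node group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; names and location format are hypothetical. */
class NodeGroupBalancingSketch {

    /** Locality tiers, declared in the order the description says they are tried. */
    enum Tier { SAME_NODE_GROUP, SAME_RACK, OFF_RACK }

    /** Rack part of an assumed "/rack/nodegroup/host" location string. */
    static String rackOf(String location) {
        return location.split("/")[1];
    }

    /** Node group part, qualified by its rack, e.g. "/r1/g1". */
    static String nodeGroupOf(String location) {
        String[] parts = location.split("/");
        return "/" + parts[1] + "/" + parts[2];
    }

    /** Classifies a <source, target> pair by how close the two nodes are. */
    static Tier tierOf(String source, String target) {
        if (nodeGroupOf(source).equals(nodeGroupOf(target))) {
            return Tier.SAME_NODE_GROUP;
        }
        if (rackOf(source).equals(rackOf(target))) {
            return Tier.SAME_RACK;
        }
        return Tier.OFF_RACK;
    }

    /** Orders targets: same node group first, then same rack, then off rack. */
    static List<String> orderTargets(String source, List<String> targets) {
        List<String> ordered = new ArrayList<>(targets);
        // List.sort is stable, so targets within one tier keep their input order.
        ordered.sort(Comparator.comparing((String t) -> tierOf(source, t)));
        return ordered;
    }

    /**
     * Returns true if moving a replica from source to target leaves no two
     * replicas of the block in the same node group.
     */
    static boolean keepsNodeGroupsDistinct(String source, String target,
                                           List<String> replicaLocations) {
        for (String replica : replicaLocations) {
            if (replica.equals(source)) {
                continue; // this is the replica being moved away
            }
            if (nodeGroupOf(replica).equals(nodeGroupOf(target))) {
                return false; // another replica already lives in that node group
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String source = "/r1/g1/h1";
        List<String> targets = Arrays.asList("/r2/g3/h5", "/r1/g2/h3", "/r1/g1/h2");
        System.out.println(orderTargets(source, targets));
    }
}
```

In this sketch the replica check is kept separate from the ordering, so a caller could walk the ordered targets and skip any target for which keepsNodeGroupsDistinct returns false.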

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Posted by "Junping Du (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HADOOP-8473:
-------------------------------

    Status: Patch Available  (was: Open)

This patch depends on HADOOP-8469 and should be checked in after it.
                


        

[jira] [Updated] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Posted by "Junping Du (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HADOOP-8473:
-------------------------------

    Issue Type: Bug  (was: Sub-task)
        Parent:     (was: HADOOP-8468)
    


        

[jira] [Commented] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Posted by "Sanjay Radia (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288676#comment-13288676 ] 

Sanjay Radia commented on HADOOP-8473:
--------------------------------------

There are two separate problems here, as mentioned in your description. Please split them into two separate JIRAs:
* correctness - two replicas are not on the same node
* performance optimization - "choose the target and source node on the same node group for balancing as the first priority".
                


        

[jira] [Commented] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Posted by "timer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288365#comment-13288365 ] 

timer commented on HADOOP-8473:
-------------------------------

How do you maintain Hadoop data availability if the node group is as large as the whole cluster?
                


        

[jira] [Commented] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Posted by "timer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288362#comment-13288362 ] 

timer commented on HADOOP-8473:
-------------------------------

Two replicas of the same block cannot be on the same node, because they would have the same file name on the host file system. You may not be able to move the block from an overloaded node to a less-loaded node; the move will report an error.

                


        

[jira] [Commented] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Posted by "Junping Du (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288441#comment-13288441 ] 

Junping Du commented on HADOOP-8473:
------------------------------------

Hi Timer,
   Thanks for the comments. The NodeGroup layer is intended to map to the hypervisor layer in a virtualized environment. So if the whole cluster is a single physical host with a bunch of VMs running on top of it, nothing can help data availability. In fact, enhancing data availability is one of the goals of this JIRA umbrella (HADOOP-8468) for Hadoop running in the cloud (a high percentage of which is built on virtualized platforms). That is, we want to make Hadoop aware of the "NodeGroup" layer so that it does not put two replicas on two VMs running on the same physical host. Thoughts?

Thanks,

Junping
                


        

[jira] [Updated] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Posted by "Junping Du (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HADOOP-8473:
-------------------------------

    Status: Open  (was: Patch Available)
    


        

[jira] [Commented] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Posted by "Junping Du (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288697#comment-13288697 ] 

Junping Du commented on HADOOP-8473:
------------------------------------

Hey Sanjay, Thanks for the comments. Will update soon.

Best,

Junping
                


        

[jira] [Updated] (HADOOP-8473) Update Balancer to support new NetworkTopology with NodeGroup

Posted by "Junping Du (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HADOOP-8473:
-------------------------------

    Attachment: HADOOP-8473-Balancer-NodeGroup-aware.patch

This patch depends on HADOOP-8439.
                
