Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2009/03/31 23:18:51 UTC

[jira] Created: (HADOOP-5603) Improve block placement performance

Improve block placement performance
-----------------------------------

                 Key: HADOOP-5603
                 URL: https://issues.apache.org/jira/browse/HADOOP-5603
             Project: Hadoop Core
          Issue Type: Improvement
          Components: dfs
            Reporter: Hairong Kuang
            Assignee: Hairong Kuang
             Fix For: 0.21.0


ReplicationTargetChooser chooses targets by repeatedly selecting random nodes and then filtering them for good targets until the required number of targets has been chosen. This may require selecting random nodes multiple times, which causes multiple traversals of the given portion of the cluster map. The code can be improved by traversing that portion of the cluster map only once.
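
The single-traversal idea described above can be sketched roughly as follows. This is only an illustration, not the code in the patch: the class SinglePassChooser, the method chooseSinglePass, and the isGoodTarget predicate are hypothetical names, and the candidate list stands in for "the given portion of the cluster map". The candidates are shuffled once and then walked once, so each node is examined at most one time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical illustration only; not part of Hadoop and not the attached patch.
final class SinglePassChooser {

    /**
     * Picks up to {@code wanted} candidates that satisfy {@code isGoodTarget},
     * visiting every candidate at most once. The result may be shorter than
     * {@code wanted} if too few good targets exist; the caller decides whether
     * that counts as a failure.
     */
    static <T> List<T> chooseSinglePass(List<T> candidates, int wanted,
                                        Predicate<T> isGoodTarget, Random rng) {
        // Copy and shuffle once so the single walk below sees a random order.
        List<T> shuffled = new ArrayList<>(candidates);
        Collections.shuffle(shuffled, rng);

        List<T> chosen = new ArrayList<>();
        for (T node : shuffled) {
            if (chosen.size() == wanted) {
                break; // enough targets found, stop early
            }
            if (isGoodTarget.test(node)) {
                chosen.add(node);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Assumed toy data: node names, with "full" nodes treated as bad targets.
        List<String> nodes = Arrays.asList("dn1", "dn2-full", "dn3", "dn4-full", "dn5");
        List<String> targets =
            chooseSinglePass(nodes, 2, n -> !n.endsWith("-full"), new Random());
        System.out.println("chosen targets: " + targets);
    }
}
```

If fewer good nodes exist than requested, the walk ends after one traversal and returns what it found, which is the case where repeated random selection is most expensive.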

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5603) Improve block placement performance

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695524#action_12695524 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5603:
------------------------------------------------

+1 patch looks good.



[jira] Updated: (HADOOP-5603) Improve block placement performance

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5603:
----------------------------------

    Attachment: blockPlace.patch

Here is a patch that makes the suggested change.

With the patch, both ReplicationTargetChooser#chooseRandom(int, String, List<Node>, long, int, List<DatanodeDescriptor>) and ReplicationTargetChooser#chooseRandom(String, List<Node>, long, int, List<DatanodeDescriptor>) traverse each node in the given portion of the cluster map at most once, even in the worst case.



[jira] Commented: (HADOOP-5603) Improve block placement performance

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694303#action_12694303 ] 

Hairong Kuang commented on HADOOP-5603:
---------------------------------------

I ran an experiment on a dfs cluster with 3150 nodes. The cluster was full, with no space left to place any block. Trunk takes around 6.5s to declare failure when trying to place a block on 2 nodes. With the patch, it takes around 2.1s to declare failure.
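
To give a rough feel for why a full cluster is the costly case, here is a toy model in Java. It is not the benchmark above and does not reproduce the trunk code: FullClusterVisitCount and its methods are hypothetical, and the retry strategy is simplified to "draw a random node, reject it, draw again until every node has been rejected". It only compares how many node examinations each strategy needs before it can declare failure.

```java
import java.util.BitSet;
import java.util.Random;

// Hypothetical toy model; not Hadoop code and not a reproduction of the experiment.
final class FullClusterVisitCount {

    /**
     * Assumed retry model: draw nodes uniformly at random (with replacement)
     * and reject each one, because a full node is never a good target.
     * Failure can only be declared once every node has been rejected.
     */
    static long drawsWithRetries(int nodeCount, Random rng) {
        BitSet rejected = new BitSet(nodeCount);
        int rejectedCount = 0;
        long draws = 0;
        while (rejectedCount < nodeCount) {
            int node = rng.nextInt(nodeCount);
            draws++;
            if (!rejected.get(node)) {
                rejected.set(node);
                rejectedCount++;
            }
        }
        return draws;
    }

    /** Single traversal: every node is examined exactly once before giving up. */
    static long visitsSinglePass(int nodeCount) {
        long visits = 0;
        for (int node = 0; node < nodeCount; node++) {
            visits++; // examine the node, find it full, move on
        }
        return visits;
    }

    public static void main(String[] args) {
        int nodeCount = 3150; // cluster size taken from the comment above
        System.out.println("retry model draws: " + drawsWithRetries(nodeCount, new Random()));
        System.out.println("single pass visits: " + visitsSinglePass(nodeCount));
    }
}
```

The retry count varies from run to run, but it can never be lower than the single-pass count, since each node has to be drawn at least once.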



[jira] Updated: (HADOOP-5603) Improve block placement performance

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5603:
----------------------------------

    Attachment: blockPlace1.patch

This patch fixes a bug that caused some unit tests to fail.



[jira] Updated: (HADOOP-5603) Improve block placement performance

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5603:
----------------------------------

    Hadoop Flags: [Reviewed]
          Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-5603) Improve block placement performance

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695634#action_12695634 ] 

Hadoop QA commented on HADOOP-5603:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12404270/blockPlace.patch
  against trunk revision 761632.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/113/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/113/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/113/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/113/console

This message is automatically generated.



[jira] Updated: (HADOOP-5603) Improve block placement performance

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5603:
----------------------------------

    Attachment: blockPlace1.patch



[jira] Updated: (HADOOP-5603) Improve block placement performance

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5603:
----------------------------------

    Status: Open  (was: Patch Available)



[jira] Commented: (HADOOP-5603) Improve block placement performance

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696355#action_12696355 ] 

Hadoop QA commented on HADOOP-5603:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12404772/blockPlace1.patch
  against trunk revision 762509.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/156/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/156/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/156/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/156/console

This message is automatically generated.



[jira] Updated: (HADOOP-5603) Improve block placement performance

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5603:
----------------------------------

    Attachment:     (was: blockPlace1.patch)



[jira] Updated: (HADOOP-5603) Improve block placement performance

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5603:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

The -1 for Eclipse classpath is caused by HADOOP-5518, not by this patch. The change is covered by existing test cases, so no new tests are needed.

I've committed this!



[jira] Updated: (HADOOP-5603) Improve block placement performance

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5603:
----------------------------------

    Status: Patch Available  (was: Open)
