Posted to dev@hama.apache.org by "Samuel Guo (JIRA)" <ji...@apache.org> on 2008/12/09 09:15:44 UTC

[jira] Created: (HAMA-130) Computing Block's range will miss some cell during blocking.

Computing Block's range will miss some cell during blocking.
------------------------------------------------------------

                 Key: HAMA-130
                 URL: https://issues.apache.org/jira/browse/HAMA-130
             Project: Hama
          Issue Type: Bug
          Components: implementation
    Affects Versions: 0.1.0
            Reporter: Samuel Guo
            Assignee: Samuel Guo
            Priority: Critical
             Fix For: 0.1.0


As the code below shows, *BlockingMapper* computes each block's column range in a loop.
If the matrix's column count is an exact multiple of the block count, the ranges are correct.
If it is not, the block size is rounded down by integer division and the trailing columns are missed.

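// mBlockColSize is the column count divided by mBlockNum (integer division).
for (int i = 0; i < mBlockNum; i++) {
  startColumn = i * mBlockColSize;
  // Inclusive end; the last block stops short when there is a remainder.
  endColumn = startColumn + mBlockColSize - 1;
  output.collect(new BlockID(blkRow, i), new VectorWritable(key.get(),
      dv.subVector(startColumn, endColumn)));
}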

For example:
If the block count is 3 and the matrix is 100 * 100, then mBlockColSize = 100 / 3 = 33.
The ranges <0 ~ 32>, <33 ~ 65>, and <66 ~ 98> are emitted, so column 99 is lost.
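
One possible way to avoid the lost columns is to let the last block extend to the final column. The sketch below is only an illustration of that idea, not the attached patch: the class BlockRanges and the method columnRanges are hypothetical names, and the ranges use inclusive ends to match the subVector(startColumn, endColumn) call above.

```java
import java.util.Arrays;

/** Illustrative helper with hypothetical names; not part of Hama. */
final class BlockRanges {

  /**
   * Splits columns [0, numColumns) into blockNum contiguous ranges.
   * Each entry is {startColumn, endColumn} with an inclusive end.
   */
  static int[][] columnRanges(int numColumns, int blockNum) {
    if (blockNum <= 0 || blockNum > numColumns) {
      throw new IllegalArgumentException("blockNum must be in [1, numColumns]");
    }
    // Integer division drops the remainder, as in the reported code.
    int blockColSize = numColumns / blockNum;
    int[][] ranges = new int[blockNum][2];
    for (int i = 0; i < blockNum; i++) {
      int startColumn = i * blockColSize;
      // The last block runs to the final column, so leftover columns are kept.
      int endColumn = (i == blockNum - 1)
          ? numColumns - 1
          : startColumn + blockColSize - 1;
      ranges[i][0] = startColumn;
      ranges[i][1] = endColumn;
    }
    return ranges;
  }

  public static void main(String[] args) {
    // 100 columns in 3 blocks: the last range ends at 99 instead of 98.
    System.out.println(Arrays.deepToString(columnRanges(100, 3)));
  }
}
```

Running main prints [[0, 32], [33, 65], [66, 99]]. The last block is then one column wider than the others, which means the blocks are no longer all the same size.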

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HAMA-130) Computing Block's range will miss some cell during blocking.

Posted by "Edward J. Yoon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655177#action_12655177 ] 

Edward J. Yoon commented on HAMA-130:
-------------------------------------

Oh, Great. +1



[jira] Updated: (HAMA-130) Computing Block's range will miss some cell during blocking.

Posted by "Edward J. Yoon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon updated HAMA-130:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Fixed.



[jira] Commented: (HAMA-130) Computing Block's range will miss some cell during blocking.

Posted by "Samuel Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655152#action_12655152 ] 

Samuel Guo commented on HAMA-130:
---------------------------------

The block algorithm does not require all blocks to be the same size.
Because of the bugs in *BlockingMapper* and *SubMatrix*, the previous implementation fails during block multiplication in some situations.

I think the patch solves the problem, and block multiplication now runs correctly.
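
To illustrate why unequal block sizes are acceptable, here is a small standalone sketch that multiplies two non-square matrices block by block, with the last block in each dimension absorbing the remainder, and compares the result with a plain triple-loop product. All names (UnequalBlockMultiply, bounds, fill, and so on) are hypothetical and this is not Hama's implementation; long values are used so the comparison is exact.

```java
import java.util.Arrays;

/** Illustrative only; hypothetical names, not Hama's API. */
final class UnequalBlockMultiply {

  /** Block i covers indices bounds[i] .. bounds[i + 1] - 1; the last block takes the remainder. */
  static int[] bounds(int n, int blockNum) {
    int size = n / blockNum;
    int[] b = new int[blockNum + 1];
    for (int i = 0; i < blockNum; i++) {
      b[i] = i * size;
    }
    b[blockNum] = n;
    return b;
  }

  /** Builds a rows x cols matrix with distinct, deterministic entries. */
  static long[][] fill(int rows, int cols) {
    long[][] m = new long[rows][cols];
    for (int i = 0; i < rows; i++) {
      for (int j = 0; j < cols; j++) {
        m[i][j] = (long) i * cols + j + 1;
      }
    }
    return m;
  }

  /** Plain triple-loop product, used as the reference result. */
  static long[][] multiply(long[][] a, long[][] b) {
    long[][] c = new long[a.length][b[0].length];
    for (int i = 0; i < a.length; i++) {
      for (int j = 0; j < b[0].length; j++) {
        for (int x = 0; x < b.length; x++) {
          c[i][j] += a[i][x] * b[x][j];
        }
      }
    }
    return c;
  }

  /** Blocked product: C(I, J) += A(I, K) * B(K, J) over all block triples. */
  static long[][] blockMultiply(long[][] a, long[][] b, int blockNum) {
    int[] rb = bounds(a.length, blockNum);    // row blocks of A
    int[] kb = bounds(b.length, blockNum);    // blocks of the shared dimension
    int[] cb = bounds(b[0].length, blockNum); // column blocks of B
    long[][] c = new long[a.length][b[0].length];
    for (int bi = 0; bi < blockNum; bi++) {
      for (int bj = 0; bj < blockNum; bj++) {
        for (int bk = 0; bk < blockNum; bk++) {
          for (int i = rb[bi]; i < rb[bi + 1]; i++) {
            for (int j = cb[bj]; j < cb[bj + 1]; j++) {
              for (int x = kb[bk]; x < kb[bk + 1]; x++) {
                c[i][j] += a[i][x] * b[x][j];
              }
            }
          }
        }
      }
    }
    return c;
  }

  public static void main(String[] args) {
    long[][] a = fill(30, 40);
    long[][] b = fill(40, 30);
    // 7 blocks do not divide 30 or 40 evenly, so the last blocks are larger.
    System.out.println(Arrays.deepEquals(multiply(a, b), blockMultiply(a, b, 7)));
  }
}
```

Running main prints true: every term a[i][x] * b[x][j] is added exactly once as long as the blocks cover each index range without gaps or overlaps, regardless of whether the blocks are the same size.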





[jira] Commented: (HAMA-130) Computing Block's range will miss some cell during blocking.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655497#action_12655497 ] 

Hudson commented on HAMA-130:
-----------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12395794/HAMA-130_v01.patch
against trunk revision 725535.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 3 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hama-Patch/128/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hama-Patch/128/artifact/trunk/build/reports/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hama-Patch/128/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hama-Patch/128/console

This message is automatically generated.



[jira] Commented: (HAMA-130) Computing Block's range will miss some cell during blocking.

Posted by "Edward J. Yoon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655127#action_12655127 ] 

Edward J. Yoon commented on HAMA-130:
-------------------------------------

Good catch, Samuel. The patch looks good.

BTW, I'm not sure whether all blocks need to be the same size. We should check.



[jira] Updated: (HAMA-130) Computing Block's range will miss some cell during blocking.

Posted by "Samuel Guo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Samuel Guo updated HAMA-130:
----------------------------

    Attachment: HAMA-130.patch

Attaching my patch.



[jira] Commented: (HAMA-130) Computing Block's range will miss some cell during blocking.

Posted by "Samuel Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654784#action_12654784 ] 

Samuel Guo commented on HAMA-130:
---------------------------------

The patch also fixes a bug in *SubMatrix*.

The *SubMatrix* bug causes the multiplication job to fail when multiplying two matrices that are not square, such as a(30 * 40) * b(40 * 30).



[jira] Updated: (HAMA-130) Computing Block's range will miss some cell during blocking.

Posted by "Edward J. Yoon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon updated HAMA-130:
--------------------------------

    Attachment: HAMA-130_v01.patch



[jira] Updated: (HAMA-130) Computing Block's range will miss some cell during blocking.

Posted by "Edward J. Yoon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon updated HAMA-130:
--------------------------------

    Status: Patch Available  (was: Open)

Submit
