Posted to common-dev@hadoop.apache.org by "stack (JIRA)" <ji...@apache.org> on 2007/12/28 00:06:52 UTC

[jira] Created: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

[hbase] minor performance improvements: Slim-down BatchOperation
----------------------------------------------------------------

                 Key: HADOOP-2495
                 URL: https://issues.apache.org/jira/browse/HADOOP-2495
             Project: Hadoop
          Issue Type: Improvement
          Components: contrib/hbase
            Reporter: stack
            Priority: Minor
             Fix For: 0.16.0


A couple of small improvements that slim down the hot BatchOperation.readFields method, which is responsible for most object creation during a bulk update.  Also, when searching for the column family, wrap the bytes in a ByteBuffer once rather than once per character get.
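
To illustrate the ByteBuffer point, here is a minimal sketch, not the patch itself. It assumes the column name is held as UTF-8 bytes of the form family:qualifier and that the family is found by scanning for the ':' delimiter. The class and method names are hypothetical. The first method allocates a ByteBuffer for every byte examined; the second wraps the array once and reuses it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the actual HStoreKey code.
final class FamilyDelimiterSearch {
    // ':' is ASCII, so in UTF-8 it never occurs inside a multi-byte character.
    static final byte DELIMITER = (byte) ':';

    // Wraps the array again for every byte examined: one allocation per get.
    static int indexOfDelimiterWrappingPerGet(byte[] column, int length) {
        for (int i = 0; i < length; i++) {
            ByteBuffer bb = ByteBuffer.wrap(column, i, length - i);
            if (bb.get() == DELIMITER) {
                return i;
            }
        }
        return -1;
    }

    // Wraps the array once and advances through the same buffer.
    static int indexOfDelimiterWrappingOnce(byte[] column, int length) {
        ByteBuffer bb = ByteBuffer.wrap(column, 0, length);
        while (bb.hasRemaining()) {
            int position = bb.position();
            if (bb.get() == DELIMITER) {
                return position;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] column = "info:name".getBytes(StandardCharsets.UTF_8);
        System.out.println(indexOfDelimiterWrappingOnce(column, column.length));
    }
}
```

Both methods return the same index; the difference is only in how many short-lived buffer objects are created on the way.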

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-2495:
--------------------------

    Status: Patch Available  (was: Open)

Builds locally.



[jira] Assigned: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HADOOP-2495:
-----------------------------

    Assignee: stack



[jira] Commented: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12555002 ] 

Hadoop QA commented on HADOOP-2495:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12372317/perf-4.patch
against trunk revision r607330.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1434/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1434/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1434/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1434/console

This message is automatically generated.



[jira] Updated: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-2495:
--------------------------

    Status: Patch Available  (was: In Progress)

Trying hudson.



[jira] Updated: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-2495:
--------------------------

    Attachment: perf-4.patch

Bring the RPC class and ObjectWritable local to hbase so we can use the faster Text write/read of Strings in place of UTF-8 for hbase RPC, and so we can make other 'optimizations', e.g. sending codes rather than actual class names.
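
As a rough sketch of the "codes rather than class names" idea, assuming a small fixed table of classes known to both ends of the RPC: the sender writes one byte per class instead of the full class name. The class ClassCodes, its methods, and the code assignments are hypothetical and use only JDK streams, not the Hadoop RPC or ObjectWritable classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; code assignments are made up.
final class ClassCodes {
    private static final Map<Class<?>, Byte> CLASS_TO_CODE = new HashMap<>();
    private static final Map<Byte, Class<?>> CODE_TO_CLASS = new HashMap<>();

    private static void register(int code, Class<?> clazz) {
        CLASS_TO_CODE.put(clazz, (byte) code);
        CODE_TO_CLASS.put((byte) code, clazz);
    }

    static {
        register(1, String.class);
        register(2, Long.class);
        register(3, byte[].class);
    }

    // Writes a single byte identifying the class.
    static void writeClass(DataOutputStream out, Class<?> clazz) throws IOException {
        Byte code = CLASS_TO_CODE.get(clazz);
        if (code == null) {
            throw new IOException("No code registered for " + clazz.getName());
        }
        out.writeByte(code.byteValue());
    }

    // Reads the byte back and maps it to the registered class.
    static Class<?> readClass(DataInputStream in) throws IOException {
        byte code = in.readByte();
        Class<?> clazz = CODE_TO_CLASS.get(code);
        if (clazz == null) {
            throw new IOException("Unknown class code " + code);
        }
        return clazz;
    }

    static byte[] encodeWithCode(Class<?> clazz) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeClass(out, clazz);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // For comparison: the class name written as a length-prefixed string.
    static byte[] encodeWithName(Class<?> clazz) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(clazz.getName());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Class<?> decode(byte[] encoded) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
            return readClass(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off in a scheme like this is that both sides must agree on the table, and an unregistered class needs some fallback, which the sketch simply rejects.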



[jira] Updated: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-2495:
--------------------------

    Attachment: perf-3.patch

Other minor improvements, including a new configuration option for hlog sequence file compression.  It was using RECORD level, the default from hadoop; the default is now NONE, which saves a few percent.  If folks want compressed log files, they'll probably want BLOCK level anyway, presuming the file is full of lots of small hbase edits.
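
A minimal sketch of what a configurable compression setting with a NONE default could look like. The enum here is a local stand-in that only mirrors the NONE, RECORD and BLOCK names mentioned above, and the property key is hypothetical; neither is taken from the patch.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical illustration only; not the actual hbase configuration code.
final class LogCompressionSetting {
    // Local stand-in for the three compression levels named in the comment.
    enum CompressionType { NONE, RECORD, BLOCK }

    // Assumed key name, chosen for this example.
    static final String KEY = "example.hlog.compression.type";

    // Default when nothing is configured.
    static final CompressionType DEFAULT = CompressionType.NONE;

    static CompressionType fromConfig(Properties conf) {
        String value = conf.getProperty(KEY);
        if (value == null) {
            return DEFAULT;
        }
        // valueOf throws IllegalArgumentException for an unrecognized name.
        return CompressionType.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

With this shape, a deployment that wants compressed logs would set the property to BLOCK and everyone else gets uncompressed logs without any configuration.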





[jira] Updated: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-2495:
--------------------------

    Status: In Progress  (was: Patch Available)



[jira] Updated: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-2495:
--------------------------

    Attachment: perf.patch

Commit message:

M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/io/TestTextSequence.java
    TextSequence no longer serializes.  Remove test for serialization.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreKey.java
    Wrap the text byte buffer once in a ByteBuffer rather than per character get.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/io/BatchOperation.java
    Put this class on a diet. Its readFields is responsible for most of the
    object creation when running the sequentialWrite experiment from PE.
    Removed the Operation enum; nice but not really needed.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/io/BatchUpdate.java
    If put has a null value, throw exception (null value means DELETE).
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/io/TextSequence.java
    Make it so it no longer serializes.  I can't think of a good reason why
    you'd want to serialize a TextSequence; if it's happening, TS is being
    misused.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    Call op.isPut instead of the removed op.getOp.
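
The BatchOperation and BatchUpdate notes above can be sketched roughly as follows, assuming an operation that carries only a column and a value, where a null value stands for a delete. The class SlimOperation, its fields, and the exception type are hypothetical and are not the code from the patch.

```java
// Hypothetical illustration only; not the actual BatchOperation class.
final class SlimOperation {
    private final String column;
    // A null value marks this operation as a delete, so no enum is needed.
    private final byte[] value;

    private SlimOperation(String column, byte[] value) {
        this.column = column;
        this.value = value;
    }

    // A put must carry a value, because null is reserved to mean delete.
    static SlimOperation put(String column, byte[] value) {
        if (value == null) {
            throw new IllegalArgumentException("Put value cannot be null; null means delete");
        }
        return new SlimOperation(column, value);
    }

    static SlimOperation delete(String column) {
        return new SlimOperation(column, null);
    }

    // Replaces a getOp()-style accessor that returned an enum constant.
    boolean isPut() {
        return value != null;
    }

    String getColumn() {
        return column;
    }

    byte[] getValue() {
        return value;
    }
}
```

A caller that previously switched on an operation type would instead branch on isPut(), which keeps one less object per operation on the read path.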



[jira] Updated: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-2495:
--------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed.  TestLogRolling failed in the test run before this one and in the run after it, so the failure is unrelated; a local retest also completed successfully.  Resolving.



[jira] Commented: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554866 ] 

Hadoop QA commented on HADOOP-2495:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12372287/perf-3.patch
against trunk revision r607330.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1428/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1428/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1428/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1428/console

This message is automatically generated.



[jira] Updated: (HADOOP-2495) [hbase] minor performance improvements: Slim-down BatchOperation

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-2495:
--------------------------

    Attachment: perf-2.patch

Added some javadoc.
