Posted to issues@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2012/08/21 01:49:38 UTC

[jira] [Created] (HBASE-6621) Reduce calls to Bytes.toInt

Lars Hofhansl created HBASE-6621:
------------------------------------

             Summary: Reduce calls to Bytes.toInt
                 Key: HBASE-6621
                 URL: https://issues.apache.org/jira/browse/HBASE-6621
             Project: HBase
          Issue Type: Bug
            Reporter: Lars Hofhansl
            Assignee: Lars Hofhansl


Bytes.toInt shows up quite often in a profiler run.
It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().

Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to compute its length by decoding length fields from the header. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just pass it in.

In the extreme case of tens of thousands of columns this noticeably reduces CPU usage.
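
As an illustration of the idea above, here is a minimal Java sketch. It is not the HBase code: CellSketch, its fields, and the decodeCalls counter are hypothetical names, and the layout (a 4-byte key length and a 4-byte value length, followed by the key and value bytes) is assumed for the example. It contrasts a two-argument constructor, which has to decode both lengths, with a three-argument constructor that accepts a length the caller already has.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for a serialized cell; NOT the HBase KeyValue class. */
final class CellSketch {
    // Assumed layout: [keyLen:int][valueLen:int][key bytes][value bytes]
    static final int HEADER_SIZE = 2 * Integer.BYTES;

    // Counts int decodes; stands in for what a profiler would attribute to Bytes.toInt.
    static int decodeCalls = 0;

    final byte[] bytes;
    final int offset;
    final int length;

    /** Length unknown: derive it from the header, at the cost of two int decodes. */
    CellSketch(byte[] bytes, int offset) {
        this(bytes, offset,
             HEADER_SIZE + toInt(bytes, offset) + toInt(bytes, offset + Integer.BYTES));
    }

    /** Length already known to the caller (for example, a scanner): no decoding needed. */
    CellSketch(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    /** Big-endian int decode, written out by hand for the sketch. */
    static int toInt(byte[] b, int off) {
        decodeCalls++;
        return ((b[off] & 0xff) << 24) | ((b[off + 1] & 0xff) << 16)
             | ((b[off + 2] & 0xff) << 8) | (b[off + 3] & 0xff);
    }

    /** Test helper that serializes one cell in the assumed layout. */
    static byte[] encode(byte[] key, byte[] value) {
        return ByteBuffer.allocate(HEADER_SIZE + key.length + value.length)
                .putInt(key.length).putInt(value.length).put(key).put(value).array();
    }

    public static void main(String[] args) {
        byte[] buf = encode(new byte[] {1, 2, 3}, new byte[] {4, 5});
        CellSketch derived = new CellSketch(buf, 0);
        int afterDerived = decodeCalls;
        CellSketch known = new CellSketch(buf, 0, derived.length);
        System.out.println("length=" + known.length
                + ", decodes with derived length=" + afterDerived
                + ", extra decodes with known length=" + (decodeCalls - afterDerived));
    }
}
```

The saving per cell is tiny; it only adds up because the constructor runs once for every cell returned during a scan.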

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438864#comment-13438864 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

Thanks for the reviews.
                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt, 6621-0.96-v4.txt

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438351#comment-13438351 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

The 2nd observation is that since we also know the keyLength of the KV already in HFileReaderV2, we might as well pass it to the created KV, further reducing calls to Bytes.toInt().

                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438415#comment-13438415 ] 

Zhihong Ted Yu commented on HBASE-6621:
---------------------------------------

{code}
+   * for length <code>length</code>, and a know <code>keyLength</code>.
{code}
'know' -> 'known'
{code}
+   * Use with caution.
{code}
Can you tell us what caution should be taken?
                

[jira] [Updated] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6621:
---------------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96.
                

[jira] [Updated] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6621:
---------------------------------

    Status: Patch Available  (was: Open)

I doubt this is going to break anything, but let's get a hadoop qa run.
                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448326#comment-13448326 ] 

Hudson commented on HBASE-6621:
-------------------------------

Integrated in HBase-0.94-security-on-Hadoop-23 #7 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/7/])
    HBASE-6621 Reduce calls to Bytes.toInt (Revision 1375665)

     Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java

                

[jira] [Updated] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6621:
---------------------------------

    Attachment: 6621-0.96.txt

Simple patch.
currKeyLen and currValueLen are up to date (in fact, they are already used in the getKey() and getValue() methods).
This patch uses them in getKeyValue() as well, and hence avoids two Bytes.toInt conversions for each KV read during scanning.
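
A rough sketch of that pattern, in plain Java rather than the actual HFileReaderV2 code: the class BlockScannerSketch and the method currentRecordLength() are hypothetical, the record layout (4-byte key length, 4-byte value length, key, value) is assumed, and only the names currKeyLen, currValueLen, readKeyValueLen, getKey and getValue are borrowed from the discussion. The lengths are decoded once per record and every accessor reuses them.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical scanner over a block of [keyLen][valueLen][key][value] records. */
final class BlockScannerSketch {
    private static final int HEADER_SIZE = 2 * Integer.BYTES;

    private final ByteBuffer block;
    private int currKeyLen;
    private int currValueLen;

    /** Assumes the block holds at least one complete record. */
    BlockScannerSketch(byte[] blockBytes) {
        this.block = ByteBuffer.wrap(blockBytes);
        readKeyValueLen();
    }

    /** Decodes both lengths once for the record at the current position. */
    private void readKeyValueLen() {
        int pos = block.position();
        currKeyLen = block.getInt(pos);
        currValueLen = block.getInt(pos + Integer.BYTES);
    }

    /** Copies the current key using the cached key length. */
    byte[] getKey() {
        int start = block.position() + HEADER_SIZE;
        return Arrays.copyOfRange(block.array(), start, start + currKeyLen);
    }

    /** Copies the current value using both cached lengths. */
    byte[] getValue() {
        int start = block.position() + HEADER_SIZE + currKeyLen;
        return Arrays.copyOfRange(block.array(), start, start + currValueLen);
    }

    /** Total serialized size of the current record, from cached lengths only. */
    int currentRecordLength() {
        return HEADER_SIZE + currKeyLen + currValueLen;
    }

    /** Advances to the next record; returns false at the end of the block. */
    boolean next() {
        int nextPos = block.position() + currentRecordLength();
        if (nextPos >= block.limit()) {
            return false;
        }
        block.position(nextPos);
        readKeyValueLen();
        return true;
    }

    public static void main(String[] args) {
        byte[] blk = {0, 0, 0, 1, 0, 0, 0, 2, 7, 8, 9,
                      0, 0, 0, 2, 0, 0, 0, 1, 10, 11, 12};
        BlockScannerSketch scanner = new BlockScannerSketch(blk);
        do {
            System.out.println(Arrays.toString(scanner.getKey()) + " -> "
                    + Arrays.toString(scanner.getValue())
                    + " (record length " + scanner.currentRecordLength() + ")");
        } while (scanner.next());
    }
}
```

A getKeyValue()-style method would hand currentRecordLength() to the object it creates instead of letting that object decode the header a second time.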
                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439145#comment-13439145 ] 

Hudson commented on HBASE-6621:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #141 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/141/])
    HBASE-6621 Reduce calls to Bytes.toInt (Revision 1375663)

     Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java

                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438445#comment-13438445 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

@Todd: I assume you are referring to v3 of the patch (the part where I suggest caching the type of the KV and a new byte member)? I had the same concern about this just being a bogus profiler indicator.

v4 (and v2) does not cache anything; it just uses the information that the HFileScanner already collected anyway (length and key length) and avoids calculating it again, so it cannot make things worse. I saw a real-life improvement with it. I'll quantify it.

                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438379#comment-13438379 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

Thinking now that maybe the v3 portion should be separate as it actually changes the storage patterns of KeyValue. The changes in v2 cannot lead to worse performance than before in any scenario.

If there are no objections I'll commit v2 soon (once Stack confirms that he in fact +1'ed v2 :) )

                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438354#comment-13438354 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

With these two changes Bytes.toInt is no longer in the top 40 (or so) methods (it used to be in the top ten).
                

[jira] [Updated] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6621:
---------------------------------

    Priority: Minor  (was: Major)
    

[jira] [Updated] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6621:
---------------------------------

    Attachment: 6621-0.96-v2.txt

Combined patch to make use of the information the HScannerV2 already has.
                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438474#comment-13438474 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

I will admit that the improvement is hard to separate from the GC noise. According to the profiler it should save about 5%.

When I average out the scan times on the client and let it run for a while I do see about a 2% improvement... Hmm. (Note this is for v4, with no additional caching in KeyValue.)

I will also try with a filter that filters everything to eliminate variance in the driving client.
                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439500#comment-13439500 ] 

Hudson commented on HBASE-6621:
-------------------------------

Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/])
    HBASE-6621 Reduce calls to Bytes.toInt (Revision 1375665)

     Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java

                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438793#comment-13438793 ] 

stack commented on HBASE-6621:
------------------------------

I +1'd v2 Lars.  v4 looks good to me.
                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438787#comment-13438787 ] 

Todd Lipcon commented on HBASE-6621:
------------------------------------

Oops, yea, I missed the fact that the caching was removed again in the later patches. The change makes sense to me.
                

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439085#comment-13439085 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

Here's another observation. In ScanQueryMatcher.match we have this:
{code}
    byte [] bytes = kv.getBuffer();
    int offset = kv.getOffset();
    int initialOffset = offset;

    int keyLength = Bytes.toInt(bytes, offset, Bytes.SIZEOF_INT);
{code}

At this point the passed kv already has its keyLength cached, so we can use {code}int keyLength = kv.getKeyLength();{code} instead and save a few more cycles.
This has a measurable effect with many columns (~3%).

A simple 1-line change. Any opposition to doing this here as well, or should I open a new issue?
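
For readers without the HBase source at hand, the idea above can be sketched in isolation. This is only an illustration under stated assumptions: SketchKeyValue, toInt and serialize are hypothetical stand-ins (not the HBase KeyValue or Bytes classes), and the buffer layout is assumed to begin with a 4-byte big-endian key length followed by a 4-byte value length.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for illustration; this is not the HBase KeyValue class.
public class KeyLengthSketch {

    // Big-endian decode of 4 bytes, playing the role of Bytes.toInt(bytes, offset, SIZEOF_INT).
    static int toInt(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xff) << 24)
            | ((bytes[offset + 1] & 0xff) << 16)
            | ((bytes[offset + 2] & 0xff) << 8)
            | (bytes[offset + 3] & 0xff);
    }

    static final class SketchKeyValue {
        private final byte[] bytes;
        private final int offset;
        private int keyLength = -1; // -1 marks "not decoded yet"

        SketchKeyValue(byte[] bytes, int offset) {
            this.bytes = bytes;
            this.offset = offset;
        }

        byte[] getBuffer() { return bytes; }

        int getOffset() { return offset; }

        // Decodes the key length on first use and reuses it afterwards.
        int getKeyLength() {
            if (keyLength < 0) {
                keyLength = toInt(bytes, offset);
            }
            return keyLength;
        }
    }

    // Assumed layout: [int keyLength][int valueLength][key bytes][value bytes].
    static byte[] serialize(byte[] key, byte[] value) {
        return ByteBuffer.allocate(8 + key.length + value.length)
            .putInt(key.length).putInt(value.length)
            .put(key).put(value).array();
    }

    public static void main(String[] args) {
        SketchKeyValue kv = new SketchKeyValue(serialize(new byte[] {1, 2, 3}, new byte[] {9}), 0);
        int decoded = toInt(kv.getBuffer(), kv.getOffset()); // decode from the raw buffer again
        int cached = kv.getKeyLength();                      // ask the object, which remembers it
        System.out.println(decoded + " " + cached);
    }
}
```

Both paths yield the same value; the second one skips the repeated decode once the length has been read for that object.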
                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt, 6621-0.96-v4.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438420#comment-13438420 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

Ahh, will correct the spelling. The caution is that if you pass in the wrong length of the key you'll get unpredictable results.
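
The trade-off behind that caution can be sketched roughly as follows. Everything here is hypothetical and for illustration only: SketchKeyValue is not the HBase class, headerReads is a counter added just to make the saving visible, and the layout (two 4-byte big-endian length fields, then key and value) is an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of "derive the length" versus "trust the caller's length".
public class KnownLengthSketch {
    static final int SIZEOF_INT = 4;
    static int headerReads = 0; // counts header decodes (illustration only)

    static int toInt(byte[] bytes, int offset) {
        headerReads++;
        return ByteBuffer.wrap(bytes, offset, SIZEOF_INT).getInt();
    }

    static final class SketchKeyValue {
        final byte[] bytes;
        final int offset;
        final int length;

        // Derives the total length by decoding both header fields.
        SketchKeyValue(byte[] bytes, int offset) {
            this(bytes, offset,
                2 * SIZEOF_INT + toInt(bytes, offset) + toInt(bytes, offset + SIZEOF_INT));
        }

        // Trusts the caller; no header decode and no validation of the length.
        SketchKeyValue(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }
    }

    // Assumed layout: [int keyLength][int valueLength][key bytes][value bytes].
    static byte[] serialize(byte[] key, byte[] value) {
        return ByteBuffer.allocate(2 * SIZEOF_INT + key.length + value.length)
            .putInt(key.length).putInt(value.length)
            .put(key).put(value).array();
    }

    public static void main(String[] args) {
        byte[] buf = serialize(new byte[] {1, 2, 3}, new byte[] {7, 8});
        SketchKeyValue derived = new SketchKeyValue(buf, 0);
        System.out.println("derived length " + derived.length + ", header reads " + headerReads);
        SketchKeyValue known = new SketchKeyValue(buf, 0, derived.length);
        System.out.println("known length " + known.length + ", header reads " + headerReads);
    }
}
```

The three-argument constructor saves the two decodes, but it accepts whatever length it is handed, so a wrong value is silently stored rather than detected.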
                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438450#comment-13438450 ] 

Hadoop QA commented on HBASE-6621:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12541708/6621-0.96-v4.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    -1 javac.  The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

    -1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
     

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2635//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2635//console

This message is automatically generated.
                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt, 6621-0.96-v4.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Updated] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6621:
---------------------------------

    Fix Version/s: 0.94.2
                   0.96.0
    
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.96.0, 0.94.2
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438944#comment-13438944 ] 

Hudson commented on HBASE-6621:
-------------------------------

Integrated in HBase-TRUNK #3250 (See [https://builds.apache.org/job/HBase-TRUNK/3250/])
    HBASE-6621 Reduce calls to Bytes.toInt (Revision 1375663)

     Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java

                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt, 6621-0.96-v4.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438430#comment-13438430 ] 

Todd Lipcon commented on HBASE-6621:
------------------------------------

Do you have benchmark results showing an improvement in actual scan speed?

When I looked into scan performance with oprofile a few months back, I found the same as you -- that a lot of time was spent in these calls. But when I also added cache miss counters to the profile, I found the reason was cache misses, not the actual CPU usage of the function. So caching them would just shift around the cache miss to the next access of the cache line elsewhere, without actually improving total performance.
                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt, 6621-0.96-v4.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438364#comment-13438364 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

One more observation: KeyValue.getType() is called a lot. Caching that byte provides a nice benefit; in my test it shaves another few percent of CPU away.
ScannerV2.{next|getKeyValue|readKeyValueLen} now represent a large percentage of the overall CPU (about 60%), which is the way we want it.

I'll add that to the patch.

@Stack: Did you look at the initial patch or V2?
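
As a rough illustration of what caching the type byte could look like, here is a minimal sketch. It is not the patch itself: SketchKeyValue is a hypothetical stand-in, and it assumes the type is the last byte of the key, with the key preceded by two 4-byte big-endian length fields.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of lazily caching a single byte read from the backing array.
public class TypeCacheSketch {
    static final int HEADER = 8; // two 4-byte length fields (assumed layout)

    static final class SketchKeyValue {
        private final byte[] bytes;
        private final int offset;
        private final int keyLength;
        private boolean typeCached = false;
        private byte type;

        SketchKeyValue(byte[] bytes, int offset) {
            this.bytes = bytes;
            this.offset = offset;
            // ByteBuffer is big-endian by default, matching the assumed header encoding.
            this.keyLength = ByteBuffer.wrap(bytes, offset, 4).getInt();
        }

        // Reads the type byte from the array once, then serves it from the field.
        byte getType() {
            if (!typeCached) {
                type = bytes[offset + HEADER + keyLength - 1];
                typeCached = true;
            }
            return type;
        }
    }

    // Assumed layout: [int keyLength][int valueLength][key bytes][value bytes].
    static byte[] serialize(byte[] key, byte[] value) {
        return ByteBuffer.allocate(HEADER + key.length + value.length)
            .putInt(key.length).putInt(value.length)
            .put(key).put(value).array();
    }

    public static void main(String[] args) {
        // The last key byte (4) plays the role of the type in this sketch.
        SketchKeyValue kv = new SketchKeyValue(serialize(new byte[] {1, 2, 3, 4}, new byte[] {9}), 0);
        System.out.println(kv.getType());
    }
}
```

The cost of this design is an extra field per object, and the cached value goes stale if the backing array is modified afterwards, so whether it pays off depends on the workload.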

                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Updated] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6621:
---------------------------------

    Attachment: 6621-0.96-v3.txt

Adds the caching of type.
(If that should be a different change, we can just integrate v2 here)
                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438491#comment-13438491 ] 

Lars Hofhansl commented on HBASE-6621:
--------------------------------------

Using a qualifier filter that just filters each qualifier (to eliminate variance from the network and the driver client), I see 7.30ms vs 7.65ms per get, a ~4% improvement.

This was a repeated get of a row with 200,000 columns, all of which are filtered by a QualifierFilter (i.e. all KVs are read at the server, but none are returned to the client).

                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt, 6621-0.96-v4.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438921#comment-13438921 ] 

Hudson commented on HBASE-6621:
-------------------------------

Integrated in HBase-0.94 #411 (See [https://builds.apache.org/job/HBase-0.94/411/])
    HBASE-6621 Reduce calls to Bytes.toInt (Revision 1375665)

     Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java

                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt, 6621-0.96-v4.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Updated] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6621:
---------------------------------

    Attachment: 6621-0.96-v4.txt

v4 is like v2 with Ted's comments addressed (the cautionary note in the comment was pointless, since the same would happen when the wrong length is specified).

                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt, 6621-0.96-v4.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 


        

[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438358#comment-13438358 ] 

stack commented on HBASE-6621:
------------------------------

+1 on patch.  Simple.
                
> Reduce calls to Bytes.toInt
> ---------------------------
>
>                 Key: HBASE-6621
>                 URL: https://issues.apache.org/jira/browse/HBASE-6621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6621-0.96.txt, 6621-0.96-v2.txt
>
>
> Bytes.toInt shows up quite often in a profiler run.
> It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue().
> Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculate the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that.
> In the extreme case of 10000's of columns this noticeably reduces CPU. 
