Posted to issues@hbase.apache.org by "Anoop Sam John (JIRA)" <ji...@apache.org> on 2015/08/04 19:55:04 UTC

[jira] [Commented] (HBASE-14186) Read mvcc vlong optimization

    [ https://issues.apache.org/jira/browse/HBASE-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654042#comment-14654042 ] 

Anoop Sam John commented on HBASE-14186:
----------------------------------------

On a JMH benchmark, the difference is:
{code}
Benchmark                Mode  Cnt          Score         Error  Units
MBBTest.readMvccNew     thrpt    6  122467888.294 ± 2143187.504  ops/s
MBBTest.readMvccOldway  thrpt    6   75684230.226 ± 9943572.564  ops/s
{code}

I also ran a PE test with all data in the off-heap cache. After noticing that _readMvccVersion() was the hot method, I made this optimization. It improves the average latency of the thread run by ~15%.

> Read mvcc vlong optimization
> ----------------------------
>
>                 Key: HBASE-14186
>                 URL: https://issues.apache.org/jira/browse/HBASE-14186
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Scanners
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14186.patch
>
>
> {code}
> for (int idx = 0; idx < remaining; idx++) {
>   byte b = blockBuffer.getByteAfterPosition(offsetFromPos + idx);
>   i = i << 8;
>   i = i | (b & 0xFF);
> }
> {code}
> This does the read as in the BIG_ENDIAN case.
> After HBASE-12600, we tend to keep the mvcc, so the byte-by-byte read looks to be eating up a lot of CPU time. (In my test, HFileReaderImpl#_readMvccVersion comes out on top in terms of hot methods.) We can optimize here by reading 4 or 2 bytes in one shot when the length of the vlong is more than 4 bytes. We will in turn use UnsafeAccess methods, which handle endianness.
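
The idea in the description can be sketched roughly as below. This is only an illustration, not the attached patch: it uses a plain java.nio.ByteBuffer (big-endian by default) as a stand-in for blockBuffer and the UnsafeAccess methods, and the class and method names (VLongReadSketch, readByteByByte, readChunked) are hypothetical. It covers only the bytes that follow the vlong's first (length) byte.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: not HBase code. Assumes the buffer is in BIG_ENDIAN
// order (the ByteBuffer default) and that 1 <= remaining <= 8.
final class VLongReadSketch {

  // Baseline: assemble the value one byte at a time, as in the quoted loop.
  static long readByteByByte(ByteBuffer buf, int offset, int remaining) {
    long i = 0;
    for (int idx = 0; idx < remaining; idx++) {
      byte b = buf.get(offset + idx);
      i = i << 8;
      i = i | (b & 0xFF);
    }
    return i;
  }

  // Chunked: take 4 bytes in one shot if available, then 2, then any
  // leftover bytes one at a time.
  static long readChunked(ByteBuffer buf, int offset, int remaining) {
    long i = 0;
    if (remaining >= Integer.BYTES) {
      // Mask so the int is treated as unsigned when widened to long.
      i = buf.getInt(offset) & 0xFFFFFFFFL;
      remaining -= Integer.BYTES;
      offset += Integer.BYTES;
    }
    if (remaining >= Short.BYTES) {
      i = (i << 16) | (buf.getShort(offset) & 0xFFFF);
      remaining -= Short.BYTES;
      offset += Short.BYTES;
    }
    for (int idx = 0; idx < remaining; idx++) {
      i = (i << 8) | (buf.get(offset + idx) & 0xFF);
    }
    return i;
  }
}
```

Both methods should return the same value for any length from 1 to 8; the chunked one simply issues fewer buffer reads for the longer encodings, which is where the mvcc values tend to land.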


