Posted to issues@hbase.apache.org by "Zheng Hu (JIRA)" <ji...@apache.org> on 2019/05/31 08:27:00 UTC

[jira] [Updated] (HBASE-22483) It's better to use 65KB as the default buffer size in ByteBuffAllocator

     [ https://issues.apache.org/jira/browse/HBASE-22483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Hu updated HBASE-22483:
-----------------------------
    Summary: It's better to use 65KB as the default buffer size in ByteBuffAllocator  (was: Maybe it's better to use 65KB as the default buffer size in ByteBuffAllocator)

> It's better to use 65KB as the default buffer size in ByteBuffAllocator
> -----------------------------------------------------------------------
>
>                 Key: HBASE-22483
>                 URL: https://issues.apache.org/jira/browse/HBASE-22483
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>         Attachments: 121240.stack, BucketCacheWriter-is-busy.png, checksum-stacktrace.png, with-buffer-size-64KB.png, with-buffer-size-65KB.png
>
>
> There are several reasons why it's better to choose 65KB as the default buffer size:
> 1. Almost all data blocks have a size of 64KB + delta, where delta is very small and depends on the size of the last KeyValue. If we use the default hbase.ipc.server.allocator.buffer.size=64KB, then each block will be allocated as a MultiByteBuff: one 64KB DirectByteBuffer plus a delta-byte HeapByteBuffer, and the HeapByteBuffer will increase GC pressure. Ideally, each data block should be allocated as a SingleByteBuff: it has a simpler data structure, faster access, and lower heap usage (see the allocation sketch after this list).
> 2. In my benchmark, I found some checksum stack traces (see [checksum-stacktrace.png|https://issues.apache.org/jira/secure/attachment/12969905/checksum-stacktrace.png]).
>  Since the blocks are MultiByteBuffs, we have to calculate the checksum through a temporary heap copy (see HBASE-21917), while with a SingleByteBuff we can speed up the checksum by calling Hadoop's checksum in its native lib, which is much faster (a sketch of both paths follows this list).
> 3. It seems the BucketCacheWriters were always busy because of the higher cost of copying from a MultiByteBuff to a DirectByteBuffer. For a SingleByteBuff we can just use unsafe array copying, while for a MultiByteBuff we have to copy byte by byte (see the copy sketch after this list).
> Anyway, I will give a benchmark for this. 
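> A minimal sketch of the arithmetic behind point 1 (the class and method names here are illustrative, not the actual ByteBuffAllocator API):
> {code:java}
> public class AllocationMath {
>     static final int KB = 1024;
>
>     // Number of fixed-size pooled buffers needed to hold one block.
>     static int buffersNeeded(int blockSize, int bufferSize) {
>         return (blockSize + bufferSize - 1) / bufferSize; // ceiling division
>     }
>
>     public static void main(String[] args) {
>         int delta = 200; // typical small overflow contributed by the last KeyValue
>         int blockSize = 64 * KB + delta;
>
>         // With 64KB buffers, two buffers are needed -> the allocator must return a MultiByteBuff.
>         System.out.println(buffersNeeded(blockSize, 64 * KB)); // prints 2
>
>         // With 65KB buffers, one buffer suffices -> a SingleByteBuff backed by one DirectByteBuffer.
>         System.out.println(buffersNeeded(blockSize, 65 * KB)); // prints 1
>     }
> }
> {code}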
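> For point 2, a minimal sketch of the two checksum paths, using the JDK's CRC32 as a stand-in for Hadoop's native checksum (the fragment handling is illustrative, not the actual MultiByteBuff code):
> {code:java}
> import java.nio.ByteBuffer;
> import java.util.zip.CRC32;
>
> public class ChecksumPaths {
>     // SingleByteBuff-like case: one contiguous buffer, checksummed in place.
>     static long checksumContiguous(ByteBuffer buf) {
>         CRC32 crc = new CRC32();
>         crc.update(buf.duplicate()); // no temp heap copy, works for direct buffers too
>         return crc.getValue();
>     }
>
>     // MultiByteBuff-like case: fragments must first be gathered into a temp heap
>     // array, which is the extra copy (and GC pressure) described in HBASE-21917.
>     static long checksumFragmented(ByteBuffer[] fragments, int totalLen) {
>         byte[] temp = new byte[totalLen];
>         int off = 0;
>         for (ByteBuffer frag : fragments) {
>             ByteBuffer dup = frag.duplicate();
>             int len = dup.remaining();
>             dup.get(temp, off, len);
>             off += len;
>         }
>         CRC32 crc = new CRC32();
>         crc.update(temp, 0, off);
>         return crc.getValue();
>     }
> }
> {code}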
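> And for point 3, a minimal sketch of the copy cost when a BucketCacheWriter writes a block into the IOEngine's DirectByteBuffer (again, illustrative names; it assumes the worst case where a MultiByteBuff copy degenerates to byte-by-byte):
> {code:java}
> import java.nio.ByteBuffer;
>
> public class CopyCost {
>     // SingleByteBuff-like case: one bulk put, which the JVM can back with an
>     // Unsafe/array copy.
>     static void bulkCopy(ByteBuffer src, ByteBuffer dst) {
>         dst.put(src.duplicate());
>     }
>
>     // MultiByteBuff-like case: walk the fragments and copy byte by byte across
>     // fragment boundaries, which is much slower than a single bulk put.
>     static void byteByByteCopy(ByteBuffer[] fragments, ByteBuffer dst) {
>         for (ByteBuffer frag : fragments) {
>             ByteBuffer dup = frag.duplicate();
>             while (dup.hasRemaining()) {
>                 dst.put(dup.get());
>             }
>         }
>     }
> }
> {code}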



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)