Posted to issues@hbase.apache.org by "Zheng Hu (JIRA)" <ji...@apache.org> on 2019/01/02 04:00:00 UTC
[jira] [Comment Edited] (HBASE-21657) PrivateCellUtil#estimatedSerializedSizeOf has been the bottleneck in 100% scan case.

    [ https://issues.apache.org/jira/browse/HBASE-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731757#comment-16731757 ] 

Zheng Hu edited comment on HBASE-21657 at 1/2/19 3:59 AM:
----------------------------------------------------------

bq. I think this method is only called if we actually return some Cells to the client
That's right. 

bq. So I guess the assumption was that when the Cell need to ship over the network to the client anyway, that some CPU won't hurt. No longer true, I guess.
I don't think so. If the bottleneck were the network or RPC, estimatedSerializedSizeOf shouldn't cost so much in the flame graph; the RPC-related methods should take a much higher share.

bq. The cells being scanned not of type ExtendedCell?
I've checked the code path and added some logging. All the cells passed to PrivateCellUtil#estimatedSerializedSizeOf were SizeCachedKeyValue* or ByteBufferedKeyValue (see HFileReaderImpl#getCell), so all of them should be instanceof ExtendedCell. The complicated conditional statements may have kept the JVM from inlining the method. Anyway, I'll provide a new performance report after applying patch.v1, which moves getSerializedSize from ExtendedCell to Cell to avoid the frequent instanceof checks. It's not a production patch, just for verification.
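
To make the idea concrete, here is a minimal sketch of the two dispatch styles being compared. It is not the actual patch or the real HBase code: SketchCell, SketchExtendedCell, SizeCachedCell and both estimate methods are hypothetical stand-ins, and the size formula is simplified for illustration.

```java
public class SerializedSizeSketch {

    /** Hypothetical stand-in for Cell; NOT the real HBase interface. */
    interface SketchCell {
        int rowLength();
        int valueLength();

        /** "After" shape: every cell can report its own size, no type check needed. */
        default int serializedSize() {
            return rowLength() + valueLength();
        }
    }

    /** Hypothetical stand-in for ExtendedCell. */
    interface SketchExtendedCell extends SketchCell {
        int cachedSize();

        @Override
        default int serializedSize() {
            return cachedSize();
        }
    }

    /** Hypothetical cell that computes its size once, at construction time. */
    static final class SizeCachedCell implements SketchExtendedCell {
        private final int rowLength;
        private final int valueLength;
        private final int cachedSize;

        SizeCachedCell(int rowLength, int valueLength) {
            this.rowLength = rowLength;
            this.valueLength = valueLength;
            this.cachedSize = rowLength + valueLength;
        }

        @Override public int rowLength() { return rowLength; }
        @Override public int valueLength() { return valueLength; }
        @Override public int cachedSize() { return cachedSize; }
    }

    /** "Before" shape: an instanceof check and a cast on every call. */
    static int estimateWithInstanceof(SketchCell cell) {
        if (cell instanceof SketchExtendedCell) {
            return ((SketchExtendedCell) cell).cachedSize() + Integer.BYTES;
        }
        // Fallback for cells that do not cache their size.
        return cell.rowLength() + cell.valueLength() + Integer.BYTES;
    }

    /** "After" shape: a single interface call, with the branch removed. */
    static int estimateWithInterfaceCall(SketchCell cell) {
        return cell.serializedSize() + Integer.BYTES;
    }

    public static void main(String[] args) {
        SketchCell cell = new SizeCachedCell(10, 100);
        System.out.println(estimateWithInstanceof(cell));
        System.out.println(estimateWithInterfaceCall(cell));
    }
}
```

Both methods return the same value for the same cell; only the dispatch differs. Whether the second form is actually faster on the scan path is exactly what the new performance report is meant to verify.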



was (Author: openinx):
bq. I think this method is only called if we actually return some Cells to the client
That's right. 

bq. So I guess the assumption was that when the Cell need to ship over the network to the client anyway, that some CPU won't hurt. No longer true, I guess.
I don't think so. If the bottleneck were the network or RPC, estimatedSerializedSizeOf shouldn't cost so much in the flame graph; the RPC-related methods should take a much higher share.

bq. The cells being scanned not of type ExtendedCell?
I've checked the code path and added some logging. All the cells passed to PrivateCellUtil#estimatedSerializedSizeOf were SizeCachedKeyValue* or ByteBufferedKeyValue (see HFileReaderImpl#getCell), so all of them should be instanceof ExtendedCell. The complicated conditional statements kept the JVM from inlining the method. Anyway, I'll provide a new performance report after applying patch.v1, which moves getSerializedSize from ExtendedCell to Cell to avoid the frequent instanceof checks. It's not a production patch, just for verification.


> PrivateCellUtil#estimatedSerializedSizeOf has been the bottleneck in 100% scan case.
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-21657
>                 URL: https://issues.apache.org/jira/browse/HBASE-21657
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
>         Attachments: HBASE-21657.v1.patch, hbase20-ssd-100-scan-traces.svg
>
>
> We are evaluating the performance of branch-2 and found that scan throughput on the SSD cluster is almost the same as on the HDD cluster. So I made a flame graph on the RS and found that PrivateCellUtil#estimatedSerializedSizeOf costs about 29% of CPU. Evidently, it has become the bottleneck in the 100% scan case.
> See the [^hbase20-ssd-100-scan-traces.svg]
> BTW, in our XiaoMi branch we introduced HRegion#updateReadRequestsByCapacityUnitPerSecond to sum up the size of cells (for metrics monitoring), so it seems the performance loss was amplified.


