Posted to issues@hbase.apache.org by "Bryan Beaudreault (Jira)" <ji...@apache.org> on 2023/01/31 18:07:00 UTC

[jira] [Updated] (HBASE-27570) Unify tracking of block IO across all read request types

     [ https://issues.apache.org/jira/browse/HBASE-27570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Beaudreault updated HBASE-27570:
--------------------------------------
    Summary: Unify tracking of block IO across all read request types  (was: Unify tracking of block IO across all request types)

> Unify tracking of block IO across all read request types
> --------------------------------------------------------
>
>                 Key: HBASE-27570
>                 URL: https://issues.apache.org/jira/browse/HBASE-27570
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Bryan Beaudreault
>            Priority: Major
>
> Currently, Get and Multiget call the RSRpcServices method [addSize|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1303-L1335], which attempts to estimate block IO by adding the capacity of the underlying cell buffer whenever that buffer changes. This is only an estimate and can be inaccurate in certain circumstances, for example when the ordering of gets in a multiget causes the same buffer to be counted twice.
> As of HBASE-27558, ScannerContext tracks the block IO for each read request. Gets and Multigets use a default scanner context that only enforces batch size and isn't exposed to RSRpcServices. We can make a small change to create a ScannerContext with LimitScope.ROW and use ScannerContext.getBlockSize() to get the exact block IO consumed by a query.
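
The double-counting case described above can be illustrated with a small standalone sketch. This is not HBase code: the class BlockIoEstimateSketch and both of its methods are hypothetical names made up for illustration, plain java.nio.ByteBuffer objects stand in for block buffers, and the "exact" method is only a stand-in for per-request tracking, not the ScannerContext implementation.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in HBase.
class BlockIoEstimateSketch {

    // Heuristic: add a buffer's capacity each time the backing buffer
    // differs (by reference) from the previous cell's buffer.
    static long estimateByBufferChange(List<ByteBuffer> cellBuffers) {
        long total = 0;
        ByteBuffer last = null;
        for (ByteBuffer buf : cellBuffers) {
            if (buf != last) {
                total += buf.capacity();
                last = buf;
            }
        }
        return total;
    }

    // Stand-in for exact tracking: count each distinct buffer once.
    // Identity semantics are needed because ByteBuffer.equals compares content.
    static long exactDistinct(List<ByteBuffer> cellBuffers) {
        Set<ByteBuffer> seen =
            Collections.newSetFromMap(new IdentityHashMap<ByteBuffer, Boolean>());
        long total = 0;
        for (ByteBuffer buf : cellBuffers) {
            if (seen.add(buf)) {
                total += buf.capacity();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        ByteBuffer blockA = ByteBuffer.allocate(64 * 1024);
        ByteBuffer blockB = ByteBuffer.allocate(64 * 1024);

        // Gets ordered so that cells alternate A, B, A: block A is revisited.
        List<ByteBuffer> order = List.of(blockA, blockB, blockA);

        System.out.println(estimateByBufferChange(order)); // 196608 (A counted twice)
        System.out.println(exactDistinct(order));          // 131072
    }
}
```

With the gets ordered A, B, A, the change-based heuristic charges block A twice, while counting each block once yields the smaller, accurate figure.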



--
This message was sent by Atlassian Jira
(v8.20.10#820010)