You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by "Andrew Kyle Purtell (Jira)" <ji...@apache.org> on 2022/07/26 19:10:00 UTC

[jira] [Resolved] (HBASE-27155) Improvements to low level scanner tracing

     [ https://issues.apache.org/jira/browse/HBASE-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Kyle Purtell resolved HBASE-27155.
-----------------------------------------
    Resolution: Won't Fix

> Improvements to low level scanner tracing
> -----------------------------------------
>
>                 Key: HBASE-27155
>                 URL: https://issues.apache.org/jira/browse/HBASE-27155
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners, tracing
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>
> Related to HBASE-27153, consider tracer semantic attributes for low level scanner details.
> Consider https://issues.apache.org/jira/secure/attachment/13006571/W-7665966-Instrument-low-level-scan-details-branch-2.2.patch (from HBASE-24637).  This was used to collect detailed metrics of the decisions of ScanQueryMatcher and related classes.
> {noformat}
> metrics: [ "block_read_keys": 477 "block_read_ns": 3427040
>  "block_reads": 13 "block_seek_ns": 1606370 "block_seeks": 169
>  "block_unpack_ns": 10256 "block_unpacks": 13 "cells_matched": 165
>  "cells_matched__hbase:meta,,1.1588230740__info": 165
>  "column_hint_include": 148 "memstore_next": 72
>  "memstore_next_ns": 136671 "memstore_seek": 2
>  "memstore_seek_ns": 631629 "reseeks": 36 "sqm_hint_done": 17
>  "sqm_hint_include": 74 "sqm_hint_seek_next_col": 74
>  "store_next": 276
>  "store_next__1c930a35ff8041368a05817adbdcce97": 40
>  "store_next__2644194fdf794815abdc940c183dab88": 40
>  "store_next__32ce31753fb244668f788fb94ab02dff": 40
>  "store_next__61c8423b9d8846c99a61cd2996b5b621": 116
>  "store_next__f4f7878c9fcf40d9902416d5c7a4097a": 40
>  "store_next_ns": 1891634
>  "store_next_ns__1c930a35ff8041368a05817adbdcce97": 269383
>  "store_next_ns__2644194fdf794815abdc940c183dab88": 299936
>  "store_next_ns__32ce31753fb244668f788fb94ab02dff": 288594
>  "store_next_ns__61c8423b9d8846c99a61cd2996b5b621": 594313
>  "store_next_ns__f4f7878c9fcf40d9902416d5c7a4097a": 439408
>  "store_reseek": 164
>  "store_reseek__1c930a35ff8041368a05817adbdcce97": 32
>  "store_reseek__2644194fdf794815abdc940c183dab88": 32
>  "store_reseek__32ce31753fb244668f788fb94ab02dff": 32
>  "store_reseek__61c8423b9d8846c99a61cd2996b5b621": 36
>  "store_reseek__f4f7878c9fcf40d9902416d5c7a4097a": 32
>  "store_reseek_ns": 2969978
>  "store_reseek_ns__1c930a35ff8041368a05817adbdcce97": 359489
>  "store_reseek_ns__2644194fdf794815abdc940c183dab88": 595115
>  "store_reseek_ns__32ce31753fb244668f788fb94ab02dff": 474642
>  "store_reseek_ns__61c8423b9d8846c99a61cd2996b5b621": 1013188
>  "store_reseek_ns__f4f7878c9fcf40d9902416d5c7a4097a": 527544
>  "store_seek": 5
>  "store_seek__1c930a35ff8041368a05817adbdcce97": 1
>  "store_seek__2644194fdf794815abdc940c183dab88": 1
>  "store_seek__32ce31753fb244668f788fb94ab02dff": 1
>  "store_seek__61c8423b9d8846c99a61cd2996b5b621": 1
>  "store_seek__f4f7878c9fcf40d9902416d5c7a4097a": 1
>  "store_seek_ns": 8862786
>  "store_seek_ns__1c930a35ff8041368a05817adbdcce97": 830421
>  "store_seek_ns__2644194fdf794815abdc940c183dab88": 585899
>  "store_seek_ns__32ce31753fb244668f788fb94ab02dff": 483605
>  "store_seek_ns__61c8423b9d8846c99a61cd2996b5b621": 5958072
>  "store_seek_ns__f4f7878c9fcf40d9902416d5c7a4097a": 1004789
>  "versions_hint_include": 74 "versions_hint_seek_next_col": 74 ]
> {noformat}
> We can see the differences between seek time and reseek time and we get the counts for same, so we can analyze if SQM is making optimal choices (or less optimal choices) or not, or if behavior has changed; and we can identify particular store file(s) that might be outliers for some reason when hunting for sources of regression. We get the time required to unpack blocks (on average). We get a count of hints supplied by base SQM functionality or filters. We get the relative contributions of query processing time separately from memstore and store files. 
> Perhaps this can be done conditionally for scans that are selected for tracing. Of course there is a performance concern, so it must be done such that the overheads really are conditional on if the path is being actively traced, and measured carefully to decide if it should be committed or not. 
> WDYT [~ndimiduk] [~zhangduo]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)