You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kylin.apache.org by "Dayue Gao (JIRA)" <ji...@apache.org> on 2016/02/27 09:29:18 UTC

[jira] [Created] (KYLIN-1455) HBase ScanMetrics are not properly logged in query log

Dayue Gao created KYLIN-1455:
--------------------------------

             Summary: HBase ScanMetrics are not properly logged in query log
                 Key: KYLIN-1455
                 URL: https://issues.apache.org/jira/browse/KYLIN-1455
             Project: Kylin
          Issue Type: Improvement
          Components: Storage - HBase
    Affects Versions: v1.2
            Reporter: Dayue Gao
            Assignee: Dayue Gao
            Priority: Minor


HBase's ScanMetrics provide users valuable information when troubleshooting query performance issues. But I found it was not properly logged, sometimes missing from the log, sometimes duplicated.

Below is an example of duplicated scan log, this is due to {{CubeSegmentTupleIterator#closeScanner()}} method is invoked two times, first in hasNext(), second in close().

{noformat}
[http-bio-8080-exec-8]:[2016-02-26 17:31:50,227][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:146)] - Scan {"loadColumnFamiliesOnDemand":null,"filter":"FuzzyRowFilter{fuzzyKeysData={\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00:\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x00\\x00\\x00\\xFF\\x00\\x00\\xFF\\x00}}, ","startRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00","stopRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x08\\x8F\\xFF\\x17\\xFF\\xFF\\x17\\xFF\\x00","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":5242880,"families":{"F1":["M"]},"caching":1024,"maxVersions":1,"timeRange":[0,9223372036854775807]}
[http-bio-8080-exec-8]:[2016-02-26 17:31:50,229][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:150)] - HBase Metrics: count=17357, ms=3194, bytes=905594, remote_bytes=905594, regions=1, not_serving_region=0, rpc=19, rpc_retries=0, remote_rpc=19, remote_rpc_retries=0
[http-bio-8080-exec-8]:[2016-02-26 17:32:58,016][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:146)] - Scan {"loadColumnFamiliesOnDemand":null,"filter":"FuzzyRowFilter{fuzzyKeysData={\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00:\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x00\\x00\\x00\\xFF\\x00\\x00\\xFF\\x00}}, ","startRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00","stopRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x08\\x8F\\xFF\\x17\\xFF\\xFF\\x17\\xFF\\x00","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":5242880,"families":{"F1":["M"]},"caching":1024,"maxVersions":1,"timeRange":[0,9223372036854775807]}
[http-bio-8080-exec-8]:[2016-02-26 17:33:04,443][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:150)] - HBase Metrics: count=17357, ms=3194, bytes=905594, remote_bytes=905594, regions=1, not_serving_region=0, rpc=19, rpc_retries=0, remote_rpc=19, remote_rpc_retries=0
{noformat}

And sometimes ScanMetrics is missing from the log, showed below. I think this is due to {{CubeSegmentTupleIterator#closeScanner()}} trying to get ScanMetrics before close the current ResultScanner. After looking into HBase client source, I found that ScanMetrics will not be written out until the scanner is closed or exhausted (no cache entries). So it'd be better to get ScanMetrics after closing the scanner.

{noformat}
[http-bio-8080-exec-2]:[2016-02-26 17:18:43,928][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:146)] - Scan {"loadColumnFamiliesOnDemand":null,"filter":"FuzzyRowFilter{fuzzyKeysData={\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00:\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x00\\x00\\x00\\xFF\\x00\\x00\\xFF\\x00}}, ","startRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x01\\x1C\\x04\\x0Cx\\x03\\x08Y","stopRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x08\\x8F\\xFF\\x17\\xFF\\xFF\\x17\\xFF\\x00","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":5242880,"families":{"F1":["M"]},"caching":1024,"maxVersions":1,"timeRange":[0,9223372036854775807]}
[http-bio-8080-exec-2]:[2016-02-26 17:19:38,228][INFO][org.apache.kylin.rest.service.QueryService.logQuery(QueryService.java:242)] -
==========================[QUERY]===============================
{noformat}

This should be easy to fix, I will submit a patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)