You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kylin.apache.org by "Dayue Gao (JIRA)" <ji...@apache.org> on 2016/02/27 09:29:18 UTC
[jira] [Created] (KYLIN-1455) HBase ScanMetrics are not properly
logged in query log
Dayue Gao created KYLIN-1455:
--------------------------------
Summary: HBase ScanMetrics are not properly logged in query log
Key: KYLIN-1455
URL: https://issues.apache.org/jira/browse/KYLIN-1455
Project: Kylin
Issue Type: Improvement
Components: Storage - HBase
Affects Versions: v1.2
Reporter: Dayue Gao
Assignee: Dayue Gao
Priority: Minor
HBase's ScanMetrics provide users valuable information when troubleshooting query performance issues. But I found it was not properly logged, sometimes missing from the log, sometimes duplicated.
Below is an example of duplicated scan log, this is due to {{CubeSegmentTupleIterator#closeScanner()}} method is invoked two times, first in hasNext(), second in close().
{noformat}
[http-bio-8080-exec-8]:[2016-02-26 17:31:50,227][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:146)] - Scan {"loadColumnFamiliesOnDemand":null,"filter":"FuzzyRowFilter{fuzzyKeysData={\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00:\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x00\\x00\\x00\\xFF\\x00\\x00\\xFF\\x00}}, ","startRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00","stopRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x08\\x8F\\xFF\\x17\\xFF\\xFF\\x17\\xFF\\x00","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":5242880,"families":{"F1":["M"]},"caching":1024,"maxVersions":1,"timeRange":[0,9223372036854775807]}
[http-bio-8080-exec-8]:[2016-02-26 17:31:50,229][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:150)] - HBase Metrics: count=17357, ms=3194, bytes=905594, remote_bytes=905594, regions=1, not_serving_region=0, rpc=19, rpc_retries=0, remote_rpc=19, remote_rpc_retries=0
[http-bio-8080-exec-8]:[2016-02-26 17:32:58,016][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:146)] - Scan {"loadColumnFamiliesOnDemand":null,"filter":"FuzzyRowFilter{fuzzyKeysData={\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00:\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x00\\x00\\x00\\xFF\\x00\\x00\\xFF\\x00}}, ","startRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00","stopRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x08\\x8F\\xFF\\x17\\xFF\\xFF\\x17\\xFF\\x00","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":5242880,"families":{"F1":["M"]},"caching":1024,"maxVersions":1,"timeRange":[0,9223372036854775807]}
[http-bio-8080-exec-8]:[2016-02-26 17:33:04,443][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:150)] - HBase Metrics: count=17357, ms=3194, bytes=905594, remote_bytes=905594, regions=1, not_serving_region=0, rpc=19, rpc_retries=0, remote_rpc=19, remote_rpc_retries=0
{noformat}
And sometimes ScanMetrics is missing from the log, showed below. I think this is due to {{CubeSegmentTupleIterator#closeScanner()}} trying to get ScanMetrics before close the current ResultScanner. After looking into HBase client source, I found that ScanMetrics will not be written out until the scanner is closed or exhausted (no cache entries). So it'd be better to get ScanMetrics after closing the scanner.
{noformat}
[http-bio-8080-exec-2]:[2016-02-26 17:18:43,928][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:146)] - Scan {"loadColumnFamiliesOnDemand":null,"filter":"FuzzyRowFilter{fuzzyKeysData={\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00:\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x00\\x00\\x00\\xFF\\x00\\x00\\xFF\\x00}}, ","startRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x01\\x1C\\x04\\x0Cx\\x03\\x08Y","stopRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x08\\x8F\\xFF\\x17\\xFF\\xFF\\x17\\xFF\\x00","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":5242880,"families":{"F1":["M"]},"caching":1024,"maxVersions":1,"timeRange":[0,9223372036854775807]}
[http-bio-8080-exec-2]:[2016-02-26 17:19:38,228][INFO][org.apache.kylin.rest.service.QueryService.logQuery(QueryService.java:242)] -
==========================[QUERY]===============================
{noformat}
This should be easy to fix, I will submit a patch for this.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)