Posted to user@kylin.apache.org by Andras Nagy <an...@gmail.com> on 2019/11/28 13:05:49 UTC

Need advice and best practices for Kylin query tuning (not cube tuning)

Dear All,

We are troubleshooting slow queries in our Kylin deployment, and we suspect
that the issue is not with the cube definitions but with our queries. In
particular, we have some quite complex queries with many range checks on
dimension values, and we have observed different response times after
rewriting queries into alternative but functionally equivalent forms.
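
To illustrate the kind of rewrite we experimented with (the table and
column names below are made up for the example, not our real schema):

    -- form A: OR'd range checks on a dimension
    SELECT site, COUNT(*) AS cnt
    FROM events
    WHERE (age >= 10 AND age < 20) OR (age >= 30 AND age < 40)
    GROUP BY site;

    -- form B: functionally equivalent (assuming integer ages)
    SELECT site, COUNT(*) AS cnt
    FROM events
    WHERE age BETWEEN 10 AND 19 OR age BETWEEN 30 AND 39
    GROUP BY site;

Both forms return the same result, yet we have seen such equivalent
rewrites perform noticeably differently.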

It's hard to draw firm conclusions, though, because we see a large variance
in response times for the same query, in the same environment, at roughly
the same time. We have disabled query caching in kylin.properties
(kylin.query.cache-enabled=false) to get more conclusive measurements of
the effect a given change has on query execution time, but we still observe
variance in response times on an environment that otherwise has no load.
Perhaps this is due to caching within HBase or within the streaming
receiver.
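
For reference, this is the single property we changed in kylin.properties:

    # Disable Kylin's query-result cache so repeated runs are measured cold.
    # (The HBase block cache and the OS page cache can still warm up, which
    # may explain the residual variance.)
    kylin.query.cache-enabled=false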

Do you have any guidelines, best practices, documentation on how to tune
queries for Kylin? (I'm aware of some cube tuning guidelines from the Kylin
documentation, but now I'm looking for advice specifically about query
optimization.)

Many thanks,
Andras

Re: Need advice and best practices for Kylin query tuning (not cube tuning)

Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Andras,

Is this specific to real-time streaming, or is it also related to normal
batch cubes? I want to narrow down the scope. Besides, how complex are the
queries? We may need more input; if you can share some sample queries, that
would help to some extent.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofengshi@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscribe@kylin.apache.org
Join Kylin dev mail group: dev-subscribe@kylin.apache.org

Re: Need advice and best practices for Kylin query tuning (not cube tuning)

Posted by Andras Istvan Nagy <an...@gmail.com>.
Hi ShaoFeng,
Great, thank you for the link!
Best regards,
Andras

Re: Need advice and best practices for Kylin query tuning (not cube tuning)

Posted by ShaoFeng Shi <sh...@apache.org>.
This is what we call a "read-write separated" deployment. Many Kylin users
run in this mode: a dedicated HBase cluster used only for queries, with a
separate Hadoop cluster for cube building. You can refer to this blog:

https://kylin.apache.org/blog/2016/06/10/standalone-hbase-cluster/

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofengshi@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscribe@kylin.apache.org
Join Kylin dev mail group: dev-subscribe@kylin.apache.org

Re: Need advice and best practices for Kylin query tuning (not cube tuning)

Posted by Andras Nagy <an...@gmail.com>.
Hi All,

Thanks a lot, guys! With your generous help, Xiaoxiang, we were able to
track down the problem: it was mostly caused by segments that were too
large and had too few regions, resulting in a low level of parallelism on
the HBase side.
The p95 query latency is now acceptable for the time being, although we
will keep looking into improving it.
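
For anyone who hits the same issue: the knobs that control how segments
are cut into HBase regions are along these lines (property names as in the
Kylin configuration reference; the values are only illustrative, so please
check the defaults for your version):

    # kylin.properties -- a smaller cut size means more regions per
    # segment, hence more parallel coprocessor scans at query time
    kylin.storage.hbase.region-cut-gb=2
    kylin.storage.hbase.min-region-count=4
    kylin.storage.hbase.max-region-count=100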

However, there is still a small number of queries that take excessively
long to execute. Upon analyzing them, it turned out that there is a regular
pattern to when they occur: they coincide with cubing job executions on the
same cluster (hourly streaming cube builds, plus a daily cube refresh to
handle very late events).

As a consequence, I'd like to move all of these YARN-based workloads (Hive,
MapReduce) to a separate EMR cluster (our environment is in AWS) and keep
the other EMR cluster only for HBase. This would also improve resource
usage on both clusters, as we wouldn't need a static split of node memory
between HBase and YARN.

Does anyone have experience with this setup? Do you see any potential
issues with it? Any feedback on this is welcome.

Thanks a lot,
Andras

Re: Need advice and best practices for Kylin query tuning (not cube tuning)

Posted by Xiaoxiang Yu <xi...@kyligence.io>.
Hi Andras,
I would like to share what I have found.
There are several steps in query execution, and for each step you can find
log lines that mark its start and end; these will help you identify which
step costs the most time.

Step 1. Check SQL ACL and grammar; then use Calcite to parse the SQL into an AST and finally into an execution plan.

   Start of step 1:
      service.QueryService:414 : The original query:
   End of step 1:
      enumerator.OLAPEnumerator:105 : query storage...
Step 2. Send RPC requests to the region servers or streaming receivers.

   Start of the RPC request to the region server:
          2019-11-28 20:01:53,273 INFO  [Query 77b0da35-a6e8-d6f9-cd50-152b4d7b0c5a-62] v2.CubeHBaseEndpointRPC:165 : The scan 30572c87 for segment UAC[20191127151000_20191128012000] is as below with 1 separate raw scans, shard part of start/end key is set to 0
2019-11-28 20:01:53,275 INFO  [Query 77b0da35-a6e8-d6f9-cd50-152b4d7b0c5a-62] v2.CubeHBaseRPC:288 : Visiting hbase table lacus:LACUS_AGEAMBMTM1: cuboid require post aggregation, from 89 to 127 Start: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x7F\x00\x00\x00\x00\x6E\x07\x0A\xFA\xAA\x00\x00\x00\x00\x00\x00\x00\x00\x00 (\x00\x00\x00\x00\x00\x00\x00\x00\x00\x7F\x00\x00\x00\x00n\x07\x0A\xFA\xAA\x00\x00\x00\x00\x00\x00\x00\x00\x00) Stop:  \x00\x00\x00\x00\x00\x00\x00\x00\x00\x7F\xFF\xFF\xFF\xFF\x6E\x07\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00 (\x00\x00\x00\x00\x00\x00\x00\x00\x00\x7F\xFF\xFF\xFF\xFFn\x07\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00) Fuzzy key counts: 1. Fuzzy keys : \x00\x00\x00\x00\x00\x00\x00\x00\x00\x7F\x00\x00\x00\x00\x6E\x07\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x01\x01\x00\x00\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01;

   End of the RPC request to the region server:
2019-11-28 20:02:10,751 INFO  [kylin-coproc--pool2-t4] v2.CubeHBaseEndpointRPC:343 : <sub-thread for Query 77b0da35-a6e8-d6f9-cd50-152b4d7b0c5a GTScanRequest 30572c87>Endpoint RPC returned from HTable lacus:LACUS_AGEAMBMTM1 Shard \x6C\x61\x63\x75\x73\x3A\x4C\x41\x43\x55\x53\x5F\x41\x47\x45\x41\x4D\x42\x4D\x54\x4D\x31\x2C\x2C\x31\x35\x37\x34\x39\x30\x36\x31\x35\x35\x35\x39\x33\x2E\x39\x64\x39\x30\x39\x36\x31\x64\x66\x61\x38\x37\x31\x37\x35\x38\x36\x37\x33\x37\x65\x33\x36\x30\x36\x31\x35\x34\x34\x36\x36\x33\x2E on host: cdh-worker-2.Total scanned row: 770235. Total scanned bytes: 47587866. Total filtered row: 0. Total aggred row: 764215. Time elapsed in EP: 17113(ms). Server CPU usage: 0.24357142857142858, server physical mem left: 4.23337984E9, server swap mem left:0.0.Etc message: start latency: 397@2,agg done@17112,compress done@17113,server stats done@17113,

      You will find statistics for each HBase RPC; check which requests take a long time or scan too many rows. You may check the following aspects:

1.     Did your SQL query hit the right/best-matching cuboid?

2.     Could you try to change the rowkey order to match your SQL query, to reduce the number of scanned rows? (See the example after this list.)

3.     Was some region too large? You could try to cut segments into smaller ones (to improve the degree of parallelism).
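
As a hypothetical example for point 2: if the rowkey order is (site, age,
city) and a query filters only on city, the scan cannot use a rowkey
prefix and has to fall back to fuzzy-key filtering over the whole segment
range; putting the most frequently filtered dimension first lets the scan
use a narrow start/stop key, like the Start/Stop keys shown in the log
above.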

Step 3. Perform query-server-side aggregation and decoding.

      This step spans from the last RPC statistics log line to the final query statistics line:
       2019-11-28 20:02:10,966 INFO  [Query 77b0da35-a6e8-d6f9-cd50-152b4d7b0c5a-62] service.QueryService:491 : Stats of SQL response: isException: false, duration: 21170, total scan count 1565099
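
Reading the sample above as a worked example (all numbers come from the
log lines quoted in this mail): the final stats line at 20:02:10,966
reports duration: 21170 ms, so the query started around 20:01:49.8; the
first scan log appears at 20:01:53,273, so step 1 took roughly 3.5 s; the
endpoint RPC returned at 20:02:10,751 having reported "Time elapsed in EP:
17113(ms)", so step 2 dominates at about 17.5 s; the remaining ~0.2 s is
step 3. Note also "cuboid require post aggregation, from 89 to 127"
together with "Total aggred row: 764215" out of 770235 scanned rows, which
suggests the query did not hit an exact precomputed cuboid and nearly
every scanned row had to be post-aggregated on the region server.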

----------------
Best wishes,
Xiaoxiang Yu