Posted to user@kylin.apache.org by Billy Liu <bi...@apache.org> on 2018/06/02 06:11:06 UTC

Re: Kylin performance

Hi Dmitry,

This is an online query bottleneck diagnostic tool designed for Apache
Kylin: https://kybot.io  You could give it a try to find out whether the
bottleneck comes from the underlying HBase service or from queries not
hitting the right cuboid.

With Warm regards

Billy Liu


Kosmachev, Dmitry <DK...@luxoft.com> wrote on Tue, May 29, 2018 at 10:10 PM:

> We have several cubes that serve requests from our application.
>
> Sometimes our application slows down, and the Kylin logs show that
> queries from this app run for hundreds of seconds. These queries are also
> shown as Slow Queries in the Kylin UI.
>
> Performance is just as poor when the same queries are run via Insight.
>
> But if the same queries are run later (an hour or two later, for instance),
> they take only a few seconds, both from the app and from Insight. So when a
> query gets stuck, it gets stuck everywhere; when it runs well, it runs well
> both in Insight and from the app.
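Because the slowdown described above comes and goes, one way to narrow it down is to re-run the same query at intervals and record how long each run takes, then compare the slow periods with HBase and cluster activity. The following is a minimal sketch, assuming the Kylin REST query endpoint (`POST /kylin/api/query` with HTTP basic authentication). The host name, credentials, and function names are hypothetical placeholders, not values from this thread.

```python
import base64
import json
import time
import urllib.request


def build_query_request(base_url, project, sql, user, password):
    """Build (but do not send) a POST request for Kylin's REST query endpoint."""
    payload = {"sql": sql, "project": project, "acceptPartial": False}
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


def time_query(request, timeout_s=600):
    """Send the request and return the wall-clock duration in seconds."""
    start = time.monotonic()
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        response.read()  # drain the body so the full response time is measured
    return time.monotonic() - start


def probe(request, runs, pause_s):
    """Run the query several times and return (timestamp, duration) pairs."""
    samples = []
    for _ in range(runs):
        samples.append((time.strftime("%Y-%m-%d %H:%M:%S"), time_query(request)))
        time.sleep(pause_s)
    return samples
```

A call such as `probe(build_query_request("http://kylin-host:7070", "scrm", sql, user, password), runs=12, pause_s=300)` would sample the query every five minutes for an hour. Repeated identical queries may be answered from Kylin's query cache, so the durations are only comparable if caching behaves the same way across runs.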
>
>
>
> There is no evidence of a problem on the HBase side, and we don’t understand
> what the source of the problem is.
>
>
>
> Here is an example from the log:
>
>
>
> 2018-05-29 16:43:44,097 DEBUG [BadQueryDetector] badquery.BadQueryHistoryManager:90 : Loaded 10 Bad Query(s)
> 2018-05-29 16:43:44,102 INFO  [BadQueryDetector] service.BadQueryDetector:170 : Problematic thread 0x5cc2f
>         at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:828)
>         at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>         at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>         at Baz.bind(Unknown Source)
>         at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:331)
>         at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:294)
>         at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:553)
>         at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:544)
>         at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:193)
>         at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>
>
>
> 2018-05-29 16:44:13,329 INFO  [Query f8464b34-589e-49c2-8f3a-648fa009309b-379951] service.QueryService:284 :
> ==========================[QUERY]===============================
> Query Id: f8464b34-589e-49c2-8f3a-648fa009309b
> SQL: SELECT QUERYID,
> CAST(SUM(EMPLOYERCNT) as float)/COUNT(PersonId) as AvgEmployer,
> MAX(EMPLOYERCNT) as MaxEmployer
> FROM
> (SELECT QUERYID, PersonId, MAX(EMPLOYERCNT) as EMPLOYERCNT
> FROM PERSON_VIEW_V2 where QueryId IN (1237)
> group by QUERYID, PersonId)
> WHERE EMPLOYERCNT IS NOT NULL AND EMPLOYERCNT > 0
> group by QUERYID
> User: ADMIN
> Success: true
> Duration: 119.237
> Project: scrm
> Realization Names: [CUBE[name=person_cube_v2_2]]
> Cuboid Ids: [768]
> Total scan count: 44354219
> Total scan bytes: 3183760803
> Result row count: 1
> Accept Partial: true
> Is Partial Result: false
> Hit Exception Cache: false
> Storage cache used: false
> Is Query Push-Down: false
> Message: null
> ==========================[QUERY]===============================
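The [QUERY] block above already contains the numbers needed for a first look at where the time goes: about 44 million rows and roughly 3 GB were scanned in 119 seconds to produce a single result row. The following is a minimal sketch for pulling those fields out of such a block and deriving a few ratios. The function names and the choice of ratios are illustrative assumptions, not part of Kylin, and the sample text is an excerpt of the log quoted above.

```python
import re

# Excerpt of the [QUERY] block quoted in this thread.
SAMPLE_LOG = """\
Duration: 119.237
Realization Names: [CUBE[name=person_cube_v2_2]]
Cuboid Ids: [768]
Total scan count: 44354219
Total scan bytes: 3183760803
Result row count: 1
Storage cache used: false
"""


def parse_query_log(text):
    """Collect 'Key: value' lines of a Kylin [QUERY] block into a dict."""
    fields = {}
    for line in text.splitlines():
        # Drop e-mail quote markers ("> ") and surrounding whitespace.
        line = line.lstrip("> ").strip()
        match = re.match(r"^([A-Za-z][A-Za-z \-]*?):\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def summarize(fields):
    """Derive simple ratios from the parsed fields (Duration is in seconds)."""
    duration_s = float(fields["Duration"])
    rows = int(fields["Total scan count"])
    size_bytes = int(fields["Total scan bytes"])
    result_rows = int(fields["Result row count"])
    return {
        "rows_scanned_per_second": rows / duration_s,
        "bytes_per_scanned_row": size_bytes / rows,
        "scanned_rows_per_result_row": rows / max(result_rows, 1),
        "scan_megabytes": size_bytes / 1e6,
    }


if __name__ == "__main__":
    for name, value in summarize(parse_query_log(SAMPLE_LOG)).items():
        print(f"{name}: {value:,.1f}")
```

Running the same summary on a fast and a slow execution of the same query would show whether the scan volume differs between them or only the duration does, which helps separate a storage-side slowdown from a change in the amount of data read.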
>
>
>
>
>
>
>
> *Dmitry Kosmachev*
> Senior Specialist
> Luxoft
>
>
>
> *From:* Kumar, Manoj H <ma...@jpmorgan.com>
> *Sent:* Tuesday, May 29, 2018 3:04 PM
> *To:* user@kylin.apache.org
> *Subject:* [Sender Auth Failure] RE: Kylin performance
>
>
>
> Can you please explain the issue in detail? Did you run the query in the
> “Insight” section first? Is it working fine there?
>
>
>
> Regards,
>
> Manoj
>
>
>
> *From:* Kosmachev, Dmitry [mailto:DKosmachev@luxoft.com]
> *Sent:* Tuesday, May 29, 2018 3:47 PM
> *To:* user@kylin.apache.org
> *Subject:* Kylin performance
>
>
>
> Hi!
>
> We have a strange situation with Kylin performance.
>
> Sometimes queries to Kylin take hundreds or even thousands of seconds.
> Because our web portal runs several of these queries whenever a user works
> with it, this performance issue can make the portal stop working
> altogether.
>
> At other times the same queries take only a few seconds (without using the
> Kylin cache). We can’t find any correlation with HDFS or HBase performance.
> Are there any best practices for debugging this situation and finding the
> bottleneck? We can’t tell whether Kylin or HBase is the cause.
>
> So far we have discovered that our biggest cube has 500 regions in HBase. We
> use the default kylin.hbase.region.cut parameter and Kylin version 2.1. I
> guess it’s not normal to have hundreds of regions.
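The message does not state how large the cube is or how its 500 regions are spread across HBase tables, so the figure is hard to judge on its own. As a rough aid, the sketch below assumes that the region count of one HBase table is approximately its stored size divided by the region cut size, clamped between a minimum and a maximum. The function name is hypothetical, and the default values used here (a 5 GB cut and a cap of 500) are assumptions that should be checked against the kylin.properties of the deployed version.

```python
def estimate_region_count(size_gb, region_cut_gb=5.0, min_regions=1, max_regions=500):
    """Rough estimate: stored size / region cut, clamped to [min_regions, max_regions].

    All default values are illustrative assumptions, not confirmed Kylin defaults.
    """
    if region_cut_gb <= 0:
        raise ValueError("region_cut_gb must be positive")
    raw = round(size_gb / region_cut_gb)
    return max(min_regions, min(max_regions, raw))


if __name__ == "__main__":
    for size_gb in (0.1, 100, 10_000):
        print(size_gb, "GB ->", estimate_region_count(size_gb), "regions")
```

Under these assumptions, a reported count of exactly 500 could simply mean that a cap was reached, or that the total is summed over many segments. Comparing the estimate with the actual per-table region counts in HBase would show which is the case.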
>
>
> *Dmitry Kosmachev*
> Senior Specialist
> *T:* +7 495 967 8030
> Luxoft Moscow
> IT
> luxoft.com <http://www.luxoft.com/r/main/>
>
>
> ------------------------------
>
>
> This e-mail and any attachment(s) are intended only for the recipient(s)
> named above and others who have been specifically authorized to receive
> them. They may contain confidential information. If you are not the
> intended recipient, please do not read this email or its attachment(s).
> Furthermore, you are hereby notified that any dissemination, distribution
> or copying of this e-mail and any attachment(s) is strictly prohibited. If
> you have received this e-mail in error, please immediately notify the
> sender by replying to this e-mail and then delete this e-mail and any
> attachment(s) or copies thereof from your system. Thank you.
>
> This message is confidential and subject to terms at: http://
> www.jpmorgan.com/emaildisclaimer including on confidentiality, legal
> privilege, viruses and monitoring of electronic messages. If you are not
> the intended recipient, please delete this message and notify the sender
> immediately. Any unauthorized use is strictly prohibited.
>
>

Re: Kylin performance

Posted by Billy Liu <bi...@apache.org>.
I just checked it again. The KyBot Client download URL has been fixed.

With Warm regards

Billy Liu


Kosmachev, Dmitry <DK...@luxoft.com> wrote on Tue, Jun 5, 2018 at 3:54 PM:

> Hi Billy!
>
>
>
> Thanks a lot!
>
> As far as I can see, the first step in using this tool is to generate a
> diagnostic package with the KyBot Client. But the download link for this
> client leads to
> http://cn.kyligence.io/download/kybot/1.1.32/kybot-client-1.1.32-hbase1.x-bin.tar.gz
> which returns a “Nothing found” message. The Products page of this site does
> not include any links to the client. Do you know whether there are any
> mirrors for downloading the client package?
>
>
>
> BR
>
>
>
> *Dmitry Kosmachev*
> Senior Specialist
> Luxoft

RE: Kylin performance

Posted by "Kosmachev, Dmitry" <DK...@luxoft.com>.
Hi Billy!

Thanks a lot!
As far as I can see, the first step in using this tool is to generate a diagnostic package with the KyBot Client. But the download link for this client leads to http://cn.kyligence.io/download/kybot/1.1.32/kybot-client-1.1.32-hbase1.x-bin.tar.gz which returns a “Nothing found” message. The Products page of this site does not include any links to the client. Do you know whether there are any mirrors for downloading the client package?

BR

Dmitry Kosmachev
Senior Specialist
Luxoft
