Posted to dev@kylin.apache.org by vipul jhawar <vi...@gmail.com> on 2015/09/21 16:42:12 UTC

coprocessor cause 100% cpu

Hi

We have noticed a pattern that causes the coprocessor to spike the
regionserver CPU to 100% over time.
Suppose we issue a query through Kylin that has to scan a lot of data,
for example multiple days with multiple filters on many dimensions, so
that a large number of rows are scanned. If the query does not return
within the RPC timeout, the client gets an error message with the
exception, but on the regionserver the processing never ends and it
ultimately hogs the regionserver.

Is there any config on the coprocessor that says: if the processing is
not completed within N time, simply time out? That way we can look at
the queries later but avoid the CPU spike, which makes the cluster
unusable.

Thanks

Re: coprocessor cause 100% cpu

Posted by Li Yang <li...@apache.org>.

Despite the great finding on the fuzzy filter, there has to be a protection
mechanism that prevents the coprocessor from exhausting the memory or CPU of
the region server after the client times out. Filed a JIRA for this:

https://issues.apache.org/jira/browse/KYLIN-1050




On Wed, Sep 23, 2015 at 4:46 PM, hongbin ma <ma...@apache.org> wrote:

> If you have many filters or IN clauses in your query, Kylin will generate a
> lot of fuzzy keys for the HBase scan. A moderate number of fuzzy keys is
> beneficial for HBase scanning, but when the number of fuzzy keys grows too
> large, scan performance degrades dramatically, because FuzzyKeyFilter has
> to explore a large space of possibilities. There is no easy way to overcome
> this issue; see my patch to HBase at:
>
> https://issues.apache.org/jira/browse/HBASE-14269
>
> The side-effect is the high CPU usage you're observing.
>
> So in https://issues.apache.org/jira/browse/KYLIN-740, whenever we find
> that too many fuzzy filters have been generated (using a magic number as
> the threshold), we discard them all and scan HBase without any fuzzy keys.
>
> Hope this is useful to you.

Re: coprocessor cause 100% cpu

Posted by hongbin ma <ma...@apache.org>.

If you have many filters or IN clauses in your query, Kylin will generate a
lot of fuzzy keys for the HBase scan. A moderate number of fuzzy keys is
beneficial for HBase scanning, but when the number of fuzzy keys grows too
large, scan performance degrades dramatically, because FuzzyKeyFilter has
to explore a large space of possibilities. There is no easy way to overcome
this issue; see my patch to HBase at:

https://issues.apache.org/jira/browse/HBASE-14269

The side-effect is the high CPU usage you're observing.
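
As an illustration of why a large number of fuzzy keys gets expensive, here
is a minimal sketch of fuzzy-key matching. It assumes the mask convention
used by HBase's FuzzyRowFilter (mask byte 0 = this position must match,
1 = any byte is accepted). The class and method names are hypothetical and
this is not HBase's implementation. In this naive form every row is compared
against every key, so the per-row cost grows with the number of keys.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative fuzzy-key matcher; a sketch, not HBase's actual filter code. */
final class FuzzyKeySketch {

    /**
     * Returns true when every fixed position (mask byte 0) of the pattern
     * equals the corresponding row-key byte. Positions with mask byte 1
     * are wildcards and accept any byte.
     */
    static boolean matches(byte[] rowKey, byte[] pattern, byte[] mask) {
        if (rowKey.length < pattern.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && rowKey[i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    /** A row passes if any one fuzzy key matches; cost is linear in the key count. */
    static boolean matchesAny(byte[] rowKey, byte[][] patterns, byte[][] masks) {
        for (int i = 0; i < patterns.length; i++) {
            if (matches(rowKey, patterns[i], masks[i])) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] row = "2015US".getBytes(StandardCharsets.US_ASCII);
        byte[] pattern = "????US".getBytes(StandardCharsets.US_ASCII);
        byte[] mask = {1, 1, 1, 1, 0, 0}; // first four bytes are wildcards
        System.out.println(matches(row, pattern, mask));
    }
}
```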

So in https://issues.apache.org/jira/browse/KYLIN-740, whenever we find
that too many fuzzy filters have been generated (using a magic number as
the threshold), we discard them all and scan HBase without any fuzzy keys.
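
The fallback described above could look roughly like the following sketch.
It assumes that fuzzy keys are formed from every combination of the
IN-clause values across the filtered dimensions, which is an assumption and
not something stated in this thread. MAX_FUZZY_KEYS and all class and method
names are hypothetical; this is not Kylin's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative cap on fuzzy-key expansion; a sketch, not Kylin's implementation. */
final class FuzzyKeyCapSketch {

    /** Hypothetical threshold; the thread only calls it "a magic number". */
    static final long MAX_FUZZY_KEYS = 100;

    /** Number of combinations given one value list per filtered dimension. */
    static long combinationCount(List<List<String>> valuesPerDimension) {
        long count = 1;
        for (List<String> values : valuesPerDimension) {
            count *= values.size();
            if (count > MAX_FUZZY_KEYS) {
                return count; // already over the cap; stop early to avoid overflow
            }
        }
        return count;
    }

    /** Expands all combinations, or returns an empty list meaning "scan without fuzzy keys". */
    static List<List<String>> fuzzyKeysOrNone(List<List<String>> valuesPerDimension) {
        if (combinationCount(valuesPerDimension) > MAX_FUZZY_KEYS) {
            return Collections.emptyList();
        }
        List<List<String>> keys = new ArrayList<>();
        keys.add(new ArrayList<>());
        for (List<String> values : valuesPerDimension) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : keys) {
                for (String value : values) {
                    List<String> key = new ArrayList<>(prefix);
                    key.add(value);
                    next.add(key);
                }
            }
            keys = next;
        }
        return keys;
    }

    public static void main(String[] args) {
        List<List<String>> filters = Arrays.asList(
                Arrays.asList("US", "CN"),
                Arrays.asList("web", "ios", "android"));
        System.out.println(fuzzyKeysOrNone(filters).size() + " fuzzy keys");
    }
}
```

Under this assumption, three dimensions with ten IN values each would already
give 1000 combinations, so the sketch would drop the fuzzy keys entirely.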

Hope this is useful to you.








On Tue, Sep 22, 2015 at 11:26 PM, vipul jhawar <vi...@gmail.com>
wrote:

> Looks like attachments are stripped off the email.
> Here is a screenshot -
> https://monosnap.com/file/JmpHEMxJVVQUhTLxTrzE1sWDn7gXg4



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone

Re: coprocessor cause 100% cpu

Posted by vipul jhawar <vi...@gmail.com>.

Looks like attachments are stripped off the email.
Here is a screenshot -
https://monosnap.com/file/JmpHEMxJVVQUhTLxTrzE1sWDn7gXg4

On Tue, Sep 22, 2015 at 5:32 PM, vipul jhawar <vi...@gmail.com>
wrote:

> Hi hongbin
>
> It is attached in the previous reply.
> Attached again.
>
> Thanks

Re: coprocessor cause 100% cpu

Posted by vipul jhawar <vi...@gmail.com>.

Hi hongbin

It is attached in the previous reply.
Attached again.

Thanks

On Tue, Sep 22, 2015 at 11:58 AM, hongbin ma <ma...@apache.org> wrote:

> hi
>
> did you forget to attach the screenshot?

Re: coprocessor cause 100% cpu

Posted by hongbin ma <ma...@apache.org>.

hi

did you forget to attach the screenshot?

On Tue, Sep 22, 2015 at 12:11 PM, vipul jhawar <vi...@gmail.com>
wrote:

> Hi
>
> We are on Kylin 0.7.2.
> A screenshot of the call stack is attached for reference.
>
> Yesterday we did some more debugging and added a timeout check in the
> coprocessor, in AggregationScanner -> buildAggrCache, similar to the
> checkMemoryUsage() check in the coprocessor, but when we enabled fuzzy
> keys it simply remained stuck for hours.
> It is not even looping: even when we added timeout checks of 1 min, the
> timeout never fired, but the coprocessor was hung for a long time and we
> had to bounce the regionserver. Could you explain what is causing the
> coprocessor to remain hung for so long without even entering the loop? Is
> it just stuck on the scan forever?
>
> After this, when we disabled the fuzzy keys, the timeout did get executed.
> On further analysis we tried reducing the fuzzy_value_cap and brought it
> down to 20.
> The problem is that when we switch on fuzzy keys and have filters that lead
> to an IN clause, the coprocessor is not deterministic: sometimes it goes
> into a spin and sometimes it executes fine. This is an issue because we
> need deterministic performance and do not want the coprocessor to run
> forever. Some queries run fine and are very fast, and some just get stuck
> forever.
>
> The client times out with an RPC timeout, but the coprocessor thread just
> hogs the CPU.
>
> Please comment.
>
> Thanks


-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone

Re: coprocessor cause 100% cpu

Posted by vipul jhawar <vi...@gmail.com>.

Hi

We are on Kylin 0.7.2.
A screenshot of the call stack is attached for reference.

Yesterday we did some more debugging and added a timeout check in the
coprocessor, in AggregationScanner -> buildAggrCache, similar to the
checkMemoryUsage() check in the coprocessor, but when we enabled fuzzy
keys it simply remained stuck for hours.
It is not even looping: even when we added timeout checks of 1 min, the
timeout never fired, but the coprocessor was hung for a long time and we
had to bounce the regionserver. Could you explain what is causing the
coprocessor to remain hung for so long without even entering the loop? Is
it just stuck on the scan forever?
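
A minimal sketch of the kind of per-row timeout check described above, for
illustration only. All names are hypothetical and this is not the Kylin
coprocessor code. It also shows one possible reading of why such a check
might never fire: the check only runs when the underlying scanner hands a
row back to the loop, so if the time is spent inside the scanner (for
example while a filter skips rows), control never reaches it. That reading
is an inference and is not confirmed in this thread.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Illustrative time budget for a scan loop; a sketch, not Kylin's code. */
final class DeadlineGuardSketch {

    /** Thrown when the scan runs past its time budget. */
    static final class ScanTimeoutException extends RuntimeException {
        ScanTimeoutException(String message) {
            super(message);
        }
    }

    private final LongSupplier nanoClock;
    private final long deadlineNanos;

    /** The clock is injected so the guard can be exercised without sleeping. */
    DeadlineGuardSketch(long budgetMillis, LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.deadlineNanos = nanoClock.getAsLong() + TimeUnit.MILLISECONDS.toNanos(budgetMillis);
    }

    /** Throws once the current time is past the deadline. */
    void check() {
        if (nanoClock.getAsLong() - deadlineNanos > 0) {
            throw new ScanTimeoutException("scan exceeded its time budget");
        }
    }

    /** Consumes rows, checking the deadline once per returned row. */
    static <T> long consume(Iterator<T> rows, DeadlineGuardSketch guard) {
        long count = 0;
        // If hasNext()/next() does not return, check() below is never reached.
        while (rows.hasNext()) {
            rows.next();
            guard.check();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        DeadlineGuardSketch guard = new DeadlineGuardSketch(60_000, System::nanoTime);
        long rows = consume(Arrays.asList("r1", "r2", "r3").iterator(), guard);
        System.out.println("rows consumed: " + rows);
    }
}
```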

After this, when we disabled the fuzzy keys, the timeout did get executed.
On further analysis we tried reducing the fuzzy_value_cap and brought it
down to 20.
The problem is that when we switch on fuzzy keys and have filters that lead
to an IN clause, the coprocessor is not deterministic: sometimes it goes
into a spin and sometimes it executes fine. This is an issue because we
need deterministic performance and do not want the coprocessor to run
forever. Some queries run fine and are very fast, and some just get stuck
forever.

The client times out with an RPC timeout, but the coprocessor thread just
hogs the CPU.

Please comment.

Thanks

On Tue, Sep 22, 2015 at 7:14 AM, hongbin ma <ma...@apache.org> wrote:

> Hi Vipul,
>
> What version are you using? Before
> https://issues.apache.org/jira/browse/KYLIN-740 we did spot some critical
> performance issues caused by many IN clauses. If you could provide a
> CPU/heap analysis (on your HBase region server), it would be easier to
> address the problem.

Re: coprocessor cause 100% cpu

Posted by hongbin ma <ma...@apache.org>.

Hi Vipul,

What version are you using? Before
https://issues.apache.org/jira/browse/KYLIN-740 we did spot some critical
performance issues caused by many IN clauses. If you could provide a
CPU/heap analysis (on your HBase region server), it would be easier to
address the problem.

On Mon, Sep 21, 2015 at 10:42 PM, vipul jhawar <vi...@gmail.com>
wrote:

> Hi
>
> We have noticed a pattern that causes the coprocessor to spike the
> regionserver CPU to 100% over time.
> Suppose we issue a query through Kylin that has to scan a lot of data,
> for example multiple days with multiple filters on many dimensions, so
> that a large number of rows are scanned. If the query does not return
> within the RPC timeout, the client gets an error message with the
> exception, but on the regionserver the processing never ends and it
> ultimately hogs the regionserver.
>
> Is there any config on the coprocessor that says: if the processing is
> not completed within N time, simply time out? That way we can look at
> the queries later but avoid the CPU spike, which makes the cluster
> unusable.
>
> Thanks
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone