Posted to dev@kylin.apache.org by zhong zhang <zz...@gmail.com> on 2016/01/28 00:34:47 UTC

kylin concurrency test

Hi All,

There is an article <http://www.bitstech.net/2016/01/04/kylin-olap/> posted
by @Hu Wei at NetEase which introduces the concurrency test results. In the
article, there is a throughput result graph. Please see the attachment.
Based on my understanding, the x-axis is the number of Kylin servers. What's
the y-axis? Is it the number of simultaneous requests?

Best regards,
Zhong

Re: kylin concurrency test

Posted by hongbin ma <ma...@apache.org>.
Hi,

The stats are for reference only; they were gathered from an early Kylin
version.
Also note that the stats may vary based on your case, so I think a hands-on
exercise is necessary if you want to do a POC.

Kylin is not a good fit for detail-level queries that return a lot of result
records, because such results are heavy to transfer in JSON format, and they
do not make much sense for analysts.
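
For the hands-on exercise, a minimal sketch of a concurrency test could look like the following. It is not taken from the NetEase test: `run_load_test` and `fake_query` are hypothetical names, `fake_query` only sleeps instead of calling Kylin, and the 90th percentile is assumed to be what the graph calls "90% Line" in the usual load-testing sense. Replace `fake_query` with a real query call to measure an actual cluster.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(run_query, threads=30, queries_per_thread=20):
    """Call run_query() from parallel workers and summarize throughput and latency."""

    def worker(_worker_id):
        # Each worker sends its queries back to back and times every call.
        latencies = []
        for _ in range(queries_per_thread):
            start = time.perf_counter()
            run_query()
            latencies.append(time.perf_counter() - start)
        return latencies

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        per_worker = list(pool.map(worker, range(threads)))
    wall_seconds = time.perf_counter() - wall_start

    all_latencies = sorted(t for chunk in per_worker for t in chunk)
    total = len(all_latencies)
    # Nearest-rank 90th percentile: index of ceil(0.9 * total) - 1.
    p90_index = (9 * total + 9) // 10 - 1
    return {
        "queries": total,
        "qps": total / wall_seconds,
        "min_s": all_latencies[0],
        "median_s": statistics.median(all_latencies),
        "p90_s": all_latencies[p90_index],
        "max_s": all_latencies[-1],
    }


def fake_query():
    # Hypothetical stand-in for a real query call; it only sleeps for 10 ms.
    time.sleep(0.01)


stats = run_load_test(fake_query, threads=4, queries_per_thread=5)
print(stats)
```

QPS here is the total number of completed queries divided by wall-clock time, so it reflects all workers together, while the latency figures describe individual queries.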

On Thu, Jan 28, 2016 at 11:50 PM, zhong zhang <zz...@gmail.com> wrote:

> Hi Hongbin, Luke, Feng and everyone,
>
> Thanks so so much for taking time to see my thread and
> your kind help. Luke, thanks for introduce Feng to me.
>
> Feng, the concurrency test is vital for our application case. We definitely
> will
> use benchmark datasets to test it later. Currently, we'd like to have a
> broad
> understanding the capacity of Kylin. Can you help me answer the following
> questions for the throughput graph?
>
> (1) Parallel Thread #, 30 for high level aggregation query and 30 for
> detail
> level query. How many high level aggregation queries are used? Does 30
> parallel threads means 30 queries are triggered at the same time?
>
> (2) Does raw records mean the total records for all the queries?
>
> (3) For HBase scan, does the return less than scan mean there is
> no hit in the cube?
>
> (4) Latency, the min, max, median are the statistical results for
> all the test queries? what's 90% Line?
>
> (5) The throughput is 72.5/sec for high level. So each query takes
> about 13.8ms. This kind of contradicts to the min latency 67ms.
> Please correct me.
>
> Thanks once again for your help. Have a wonderful day!
>
> Best regards,
> Zhong
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone

Re: kylin concurrency test

Posted by zhong zhang <zz...@gmail.com>.
Hi Hongbin, Luke, Feng and everyone,

Thanks so much for taking the time to read my thread and for
your kind help. Luke, thanks for introducing Feng to me.

Feng, the concurrency test is vital for our application case. We will
definitely use benchmark datasets to test it later. For now, we'd like to
get a broad understanding of the capacity of Kylin. Can you help me answer
the following questions about the throughput graph?

(1) Parallel Thread #: 30 for high-level aggregation queries and 30 for
detail-level queries. How many high-level aggregation queries are used? Does
30 parallel threads mean 30 queries are triggered at the same time?

(2) Does "raw records" mean the total records for all the queries?

(3) For the HBase scan, does a return count lower than the scan count mean
there is no hit in the cube?

(4) Latency: are the min, max, and median the statistical results for
all the test queries? What is the 90% Line?

(5) The throughput is 72.5/sec for high-level queries, so each query takes
about 13.8 ms. This seems to contradict the min latency of 67 ms.
Please correct me.

Thanks once again for your help. Have a wonderful day!

Best regards,
Zhong
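
On question (5), one possible reading is sketched below. It assumes the 72.5/sec figure was measured with all 30 parallel threads busy at once and each thread issuing its next query immediately, which the thread does not confirm. Under that assumption Little's law (concurrency = throughput x latency) applies, and the 13.8 ms figure only holds if queries run strictly one at a time. The function name `mean_latency_ms` is made up for illustration.

```python
def mean_latency_ms(throughput_qps, concurrency):
    """Little's law: concurrency = throughput * latency, so latency = concurrency / throughput."""
    return concurrency / throughput_qps * 1000.0


# One query at a time: 1 / 72.5 s, the 13.8 ms figure from question (5).
serial_ms = mean_latency_ms(72.5, 1)

# 30 queries in flight at once (assumed): mean latency of roughly 414 ms.
parallel_ms = mean_latency_ms(72.5, 30)

print(round(serial_ms, 1), round(parallel_ms, 1))
```

With 30 queries in flight the implied mean latency is far above 67 ms, so a minimum latency of 67 ms would not contradict a throughput of 72.5/sec.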

On Wed, Jan 27, 2016 at 10:34 PM, yu feng <ol...@gmail.com> wrote:

> Yes, It is QPS, this result comes from page 36 in Apache
> Kylin-Hadoop上的大规模联机分析平台
> <
> http://events.linuxfoundation.org/sites/events/files/slides/Apache%20Kylin%202014%20Dec.pdf
> >,
> we do the same test in one and two kylin query node query result and
> get similar result , so we use that picture for convenience, bottleneck of
> kylin query throughput rely on hbase scan performance, which will related
> to regionserver number and machine configuration, network etc.

Re: kylin concurrency test

Posted by yu feng <ol...@gmail.com>.
Yes, it is QPS. This result comes from page 36 of "Apache
Kylin-Hadoop上的大规模联机分析平台" (Apache Kylin: a large-scale online
analytical platform on Hadoop)
<http://events.linuxfoundation.org/sites/events/files/slides/Apache%20Kylin%202014%20Dec.pdf>.
We ran the same test with one and two Kylin query nodes and got similar
results, so we used that picture for convenience. The bottleneck of Kylin
query throughput is HBase scan performance, which depends on the number of
region servers, machine configuration, network, etc.


2016-01-28 10:08 GMT+08:00 Luke Han <lu...@gmail.com>:

> It's QPS, please contact Yu Feng (kylin committer) from NetEase for more
> detail.
>
> Thanks.
> Luke
>
>
> Best Regards!
> ---------------------
>
> Luke Han

Re: kylin concurrency test

Posted by Luke Han <lu...@gmail.com>.
It's QPS. Please contact Yu Feng (Kylin committer) from NetEase for more
details.

Thanks.
Luke


Best Regards!
---------------------

Luke Han

On Thu, Jan 28, 2016 at 9:43 AM, hongbin ma <ma...@apache.org> wrote:

> i think by default it is QPS (queries per second)

Re: kylin concurrency test

Posted by hongbin ma <ma...@apache.org>.
I think by default it is QPS (queries per second).

On Thu, Jan 28, 2016 at 7:34 AM, zhong zhang <zz...@gmail.com> wrote:

> Hi All,
>
> There is an article <http://www.bitstech.net/2016/01/04/kylin-olap/>posted
> by @Hu Wei at Neteast which introduces the concurrency test results. In the
> article, there is a throughput result graph. Please see the attached.
> Based on my understanding, the x-axis is the number of Kylin server.
> What's the y-axis? Is it the requests at the same time?
>
> Best regards,
> Zhong
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone