Posted to dev@kylin.apache.org by Mars J <xu...@gmail.com> on 2016/08/12 02:28:16 UTC

Kylin Query Performance

Hi,
    I am running Kylin 1.5.2.1 and have built a cube successfully. The cube
size is 3.2 GB, and the fact table and the dimension table have 1.5 million
records each. The HTable count is about 200 billion.
    My cube design includes 3 derived dimensions and 5 normal dimensions,
which form a hierarchy dimension in the aggregation group. Rowkeys are
generated automatically; the first and second rowkeys are the
highest-cardinality columns in the fact table and in one dimension table,
respectively.

    When I query it from Kylin Insight, it takes 11 s, and running the same
query a second time also takes 10+ s. How can I optimize this?

Re: Kylin Query Performance

Posted by Li Yang <li...@apache.org>.
Kylin caches recent query results. If the same query comes in again and hits
the cache, the previous result is returned immediately. This may explain
why your second request of the same query is super fast.
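
The caching behaviour described above can be sketched with a toy result
cache. This is only an illustration in plain Python, not Kylin's code: the
class name QueryResultCache, the slow_scan stand-in, and the choice to key
the cache on the whitespace-normalized SQL text are all assumptions made
for the example. It shows why an identical query returns at once while a
query with a different limit has to be executed again.

```python
class QueryResultCache:
    """Toy result cache keyed on the SQL text (hypothetical, not Kylin's code)."""

    def __init__(self, execute):
        self._execute = execute  # function that actually runs the query
        self._results = {}       # normalized SQL text -> cached result
        self.hits = 0
        self.misses = 0

    def query(self, sql):
        # Assumption: only whitespace is normalized, so any change to the
        # select list, group by, join, or limit produces a different key.
        key = " ".join(sql.split())
        if key in self._results:
            self.hits += 1
            return self._results[key]
        self.misses += 1
        result = self._execute(sql)
        self._results[key] = result
        return result


def slow_scan(sql):
    # Stand-in for an expensive storage scan.
    return [("result of", sql)]


cache = QueryResultCache(slow_scan)
cache.query("select A, B from fact limit 10")  # miss: runs the scan
cache.query("select A, B from fact limit 10")  # hit: served from the cache
cache.query("select A, B from fact limit 20")  # miss: different SQL text
```

Under these assumptions, only the second call is served from the cache; the
third pays the full scan cost again because its text differs.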

On Fri, Aug 19, 2016 at 11:10 AM, Mars J <xu...@gmail.com> wrote:

> The query is just 'select A,B from Fact f left join dima a on f.no=a.no...'.
> The first run takes 10+ s, and the same SQL then returns in 0.0 s, but when
> I change the limit N, any column after 'select' or 'group by', or the
> joined table, it takes more than 10 s again.
> The source tables fact and dima have 1.6 million records, and the HTable
> count is 1.9 billion.

Re: Kylin Query Performance

Posted by hongbin ma <ma...@apache.org>.
Does your test query contain any aggregations, such as min, max, or sum? If
there is no aggregation, the query will scan the base cuboid, which in some
bad cases is nearly as large as the raw data.
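
To make the base cuboid point concrete, here is a toy sketch in plain
Python. It is not Kylin code: the sample rows, the dimension names (region,
city, product), and the build_cuboid helper are hypothetical. It
pre-aggregates a sum for a chosen set of dimensions, so the base cuboid
(all dimensions) keeps one entry per distinct combination, while a cuboid
on a single dimension is far smaller.

```python
# Hypothetical fact rows: (region, city, product, amount)
rows = [
    ("east", "nyc", "a", 10),
    ("east", "nyc", "b", 5),
    ("east", "bos", "a", 7),
    ("west", "sfo", "a", 3),
    ("west", "sfo", "b", 8),
    ("west", "lax", "b", 2),
]
DIMS = ("region", "city", "product")


def build_cuboid(rows, dims):
    """Sum the amount per distinct combination of the given dimensions."""
    idx = [DIMS.index(d) for d in dims]
    cuboid = {}
    for row in rows:
        key = tuple(row[i] for i in idx)
        cuboid[key] = cuboid.get(key, 0) + row[3]
    return cuboid


base = build_cuboid(rows, DIMS)              # all dimensions: the base cuboid
by_region = build_cuboid(rows, ("region",))  # what "sum ... group by region" needs

print(len(base), len(by_region))
```

In this sample every row has a distinct dimension combination, so the base
cuboid is as large as the raw data, which is the bad case mentioned above.
A query that aggregates by region only needs the two-entry cuboid.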

On Fri, Aug 19, 2016 at 11:10 AM, Mars J <xu...@gmail.com> wrote:

> The query is just 'select A,B from Fact f left join dima a on f.no=a.no...'.
> The first run takes 10+ s, and the same SQL then returns in 0.0 s, but when
> I change the limit N, any column after 'select' or 'group by', or the
> joined table, it takes more than 10 s again.
> The source tables fact and dima have 1.6 million records, and the HTable
> count is 1.9 billion.



-- 
Regards,

*Bin Mahone | 马洪宾*

Re: Kylin Query Performance

Posted by Mars J <xu...@gmail.com>.
The query is just 'select A,B from Fact f left join dima a on f.no=a.no...'.
The first run takes 10+ s, and the same SQL then returns in 0.0 s, but when
I change the limit N, any column after 'select' or 'group by', or the
joined table, it takes more than 10 s again.
The source tables fact and dima have 1.6 million records, and the HTable
count is 1.9 billion.

2016-08-18 17:50 GMT+08:00 Li Yang <li...@apache.org>:

> Performance troubleshooting is complicated and requires a lot of
> information. If you can share a diagnosis pack, people may be able to help.

Re: Kylin Query Performance

Posted by Li Yang <li...@apache.org>.
Performance troubleshooting is complicated and requires a lot of
information. If you can share a diagnosis pack, people may be able to help.

On Sat, Aug 13, 2016 at 10:32 PM, Yiming Liu <li...@gmail.com>
wrote:

> What kind of queries are they? How many records are supposed to be
> returned? Could you send the query log as well?

Re: Kylin Query Performance

Posted by Yiming Liu <li...@gmail.com>.
What kind of queries are they? How many records are supposed to be
returned? Could you send the query log as well?

2016-08-12 10:28 GMT+08:00 Mars J <xu...@gmail.com>:

> Hi,
>     I am running Kylin 1.5.2.1 and have built a cube successfully. The cube
> size is 3.2 GB, and the fact table and the dimension table have 1.5 million
> records each. The HTable count is about 200 billion.
>     My cube design includes 3 derived dimensions and 5 normal dimensions,
> which form a hierarchy dimension in the aggregation group. Rowkeys are
> generated automatically; the first and second rowkeys are the
> highest-cardinality columns in the fact table and in one dimension table,
> respectively.
>
>     When I query it from Kylin Insight, it takes 11 s, and running the same
> query a second time also takes 10+ s. How can I optimize this?
>



-- 
With Warm regards

Yiming Liu (刘一鸣)