
Posted to dev@kylin.apache.org by vishnu <ps...@gmail.com> on 2018/09/12 09:06:03 UTC

apache kylin vs apache druid

In what contexts is Apache Kylin better than Apache Druid, and vice versa?

--
Sent from: http://apache-kylin.74782.x6.nabble.com/

Re: apache kylin vs apache druid

Posted by ShaoFeng Shi <sh...@apache.org>.

You can check the previous discussion:

https://mail-archives.apache.org/mod_mbox/kylin-dev/201503.mbox/%3CCAKmQrOY0fjZLUU0MGo5aajZ2uLb3T0qJknHQd+Wv1oxd5PKixQ@mail.gmail.com%3E



-- 
Best regards,

Shaofeng Shi 史少锋

Re: apache kylin vs apache druid

Posted by "Zhong, Yanghong" <ya...@ebay.com>.

Hi Vishnu,

Suppose there are 20 dimensions, d1, d2, ..., d20, and you have prebuilt the cuboid [d1, d2]. The cuboid [d1, d2, ..., d20] contains 1 billion rows, while [d1, d2] contains only about 1k rows. If your query hits cuboid [d1, d2], it is clearly better to answer it from the 1k-row cuboid [d1, d2] in Kylin than from the 1-billion-row [d1, d2, ..., d20] data in Druid.
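
To make the cuboid argument concrete, here is a minimal Python sketch. It is not Kylin or Druid code; the toy table, the build_cuboid and sum_by_scan helpers, and the column values are all made up for illustration, with d3 standing in for d3..d20. It only shows that a pre-aggregated [d1, d2] result gives the same answer as a full scan while holding far fewer rows.

```python
from collections import defaultdict

# Toy fact table (hypothetical data): each row is (d1, d2, d3, measure).
# d3 stands in for the remaining dimensions d3..d20.
rows = [
    ("US", "web", "a", 10),
    ("US", "web", "b", 5),
    ("US", "app", "a", 7),
    ("CN", "web", "c", 3),
    ("CN", "web", "a", 4),
]

def build_cuboid(rows, dims):
    """Pre-aggregate SUM(measure) grouped by the given dimension indexes."""
    cuboid = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in dims)
        cuboid[key] += row[-1]
    return dict(cuboid)

def sum_by_scan(rows, d1, d2):
    """Answer the same query by scanning every detailed row."""
    return sum(r[-1] for r in rows if r[0] == d1 and r[1] == d2)

# Built once, ahead of query time.
cuboid_d1_d2 = build_cuboid(rows, dims=(0, 1))

# Query: SUM(measure) WHERE d1 = 'US' AND d2 = 'web'
from_cuboid = cuboid_d1_d2[("US", "web")]   # one lookup in the small cuboid
from_scan = sum_by_scan(rows, "US", "web")  # touches every detailed row

print(from_cuboid, from_scan)
print(len(cuboid_d1_d2), "cuboid rows vs", len(rows), "detailed rows")
```

The gap is tiny here, but it is the same effect as 1k rows versus 1 billion rows in the example above.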

Another strength of Kylin is range filtering, because it stores its data in HBase, a sorted key-value store.
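
As a rough illustration of why sorted keys help range filters, the sketch below uses Python's bisect module on a sorted list. This is only an analogy, not how HBase is implemented; the date-like row keys and the range_scan helper are assumptions for the example.

```python
import bisect

# Hypothetical row keys, kept in sorted order the way a sorted
# key-value store keeps its keys.
sorted_keys = [
    "2018-09-01",
    "2018-09-05",
    "2018-09-10",
    "2018-09-12",
    "2018-09-20",
]

def range_scan(keys, start, end):
    """Return the keys in [start, end) by locating both bounds with binary search."""
    lo = bisect.bisect_left(keys, start)
    hi = bisect.bisect_left(keys, end)
    return keys[lo:hi]

# Only the matching slice is read; keys outside the range are never visited.
hits = range_scan(sorted_keys, "2018-09-05", "2018-09-12")
print(hits)
```

Because the keys are ordered, the start of the range is found by search and the scan stops at the end bound, instead of checking every key.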


Druid has a clear advantage in scan performance when the query touches only a few columns, for two reasons:
- it uses columnar storage, which stores the data efficiently
- the data is all loaded into memory before it is queried
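
The columnar point can be sketched in a few lines of Python. This is a simplified picture, not Druid's actual segment format; the row_store and column_store structures and the column names are invented for the example. It shows that an aggregate over one column only needs that column's values when the data is laid out by column.

```python
# Row-oriented layout (hypothetical data): every row carries all columns.
row_store = [
    {"d1": "US", "d2": "web", "clicks": 10, "cost": 1.5},
    {"d1": "US", "d2": "app", "clicks": 7, "cost": 0.5},
    {"d1": "CN", "d2": "web", "clicks": 3, "cost": 2.0},
]

# Column-oriented layout of the same data: one list per column.
column_store = {
    name: [row[name] for row in row_store]
    for name in row_store[0]
}

def sum_from_rows(store, column):
    """Visit every full row to pick out a single field."""
    return sum(row[column] for row in store)

def sum_from_columns(store, column):
    """Read only the one column the query refers to."""
    return sum(store[column])

total_rows = sum_from_rows(row_store, "clicks")
total_cols = sum_from_columns(column_store, "clicks")
print(total_rows, total_cols)
```

Both layouts give the same total, but the columnar version never reads d1, d2, or cost, which is why queries that touch few columns benefit most.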

Best regards,
Yanghong Zhong

