Posted to user@kylin.apache.org by lufeng <am...@163.com> on 2017/04/05 10:17:57 UTC

Difference between using ES and Kylin as real-time OLAP engine?

Hi All

I want to build an OLAP platform to analyze our very large data set. My first thought was Hive, but it cannot meet our real-time query needs. I then found that many companies have used ES to build their OLAP engines, for example Tencent's Hermes. So I would like to know the differences between ES and Apache Kylin along dimensions such as filtering and aggregation queries:

* Performance 
* Flexible queries
* Data Model
* BI integration 
* …

After Kylin released its latest architecture, which can support different storage engines, I found that Kylin will support ES[1] as its backend storage engine. So my understanding is that ES is a ROLAP engine, which many people use as a NoSQL DB, while Kylin is a MOLAP engine.

I would love to hear some feedback on my confusion.

Thanks. 

[1] https://github.com/apache/kylin/pull/23

Re: Difference between using ES and Kylin as real-time OLAP engine?

Posted by Li Yang <li...@apache.org>.
My personal opinion.

Choosing the right OLAP solution is not easy. There are different options
for different scales.

If it is millions of rows, then an RDBMS like MySQL / PostgreSQL will fly.

If it is billions of rows, then Druid / ES / Kylin will all work, provided you
have the right hardware and software configuration.

If it is trillions of rows (or more), then Kylin has a big advantage thanks
to its precalculation system. Read more about precalculation here:
<http://www.slideshare.net/YangLi43/apache-kylin-20-from-classic-olap-to-realtime-data-warehouse>
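
As a rough illustration of what precalculation means in a MOLAP-style engine, here is a minimal sketch in Python. It is not Kylin's actual implementation or API: the sample rows, the dimension names, and the helpers build_cube and query are all hypothetical. The idea is only that aggregates for every combination of dimensions are computed once at build time, so a GROUP BY query becomes a lookup instead of a scan over the raw rows.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (country, device, revenue). Illustrative data only.
ROWS = [
    ("US", "mobile", 10),
    ("US", "desktop", 5),
    ("CN", "mobile", 7),
    ("CN", "mobile", 3),
]
DIMENSIONS = ("country", "device")


def build_cube(rows, dimensions):
    """Pre-aggregate SUM(revenue) for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(range(len(dimensions)), size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[i] for i in dims)
                totals[key] += row[-1]  # the last column is the measure
            cube[tuple(dimensions[i] for i in dims)] = dict(totals)
    return cube


def query(cube, group_by):
    """Answer a GROUP BY from the precomputed aggregates, without scanning rows.

    Assumes group_by lists dimensions in the same order as DIMENSIONS.
    """
    return cube[tuple(group_by)]


CUBE = build_cube(ROWS, DIMENSIONS)
print(query(CUBE, ["country"]))
```

The trade-off in this sketch is that build time and storage grow with the number of dimension combinations, while query cost no longer depends on the number of raw rows.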


Cheers
Yang


Re: Difference between using ES and Kylin as real-time OLAP engine?

Posted by lufeng <am...@163.com>.
Hi Luke

I found a comparison between Druid, which can provide sub-second OLAP queries, and ES[1]. Perhaps some of the same points apply to a comparison between Kylin and ES.

Yes, compared with HBase, ES has no standout features that would justify investing in this path, and Kylin does a lot of internal optimization on HBase storage.

I will be glad to share the comparison with the community once I have made some progress.

Thanks for your reply.


[1] http://druid.io/docs/latest/comparisons/druid-vs-elasticsearch.html


Re: Difference between using ES and Kylin as real-time OLAP engine?

Posted by Luke Han <lu...@gmail.com>.
ES is not a target storage engine for Kylin so far, at least not in the
upcoming release plan.

There are already many storage options in the Hadoop ecosystem, and I don't
think there's a strong reason to invest in this path.

I also don't recall any benchmark or comparison available today for your
purpose.
Please share with us if you have a chance to do it :)

Thanks.


Best Regards!
---------------------

Luke Han
