Posted to dev@lens.apache.org by "amareshwarisr ." <am...@gmail.com> on 2015/01/20 06:42:10 UTC

Re: Understanding commonalties between kylin and lens

Sorry for the late reply. Somehow this landed in the spam folder of my inbox.

Answers inline

On Fri, Dec 26, 2014 at 8:28 PM, Luke Han <lu...@apache.org> wrote:

> Hi Amareshwari,
>     Thanks for contacting us. The open source world always makes it so
> easy to connect with each other and exchange ideas; I really like it :-)
>
>
Very true :)

>     Having checked Lens' proposal (
> http://wiki.apache.org/incubator/LensProposal) and the documentation on
> your site (
> http://svn.apache.org/repos/asf/incubator/lens/site/publish/current/index.html),
> including what you listed in your mail, I agree with you that we are both
> trying to solve a similar problem: Fast and Easy Analytics for Big Data.
>
Yes. The main goal of Lens is to give the user a single unified interface.
It achieves fast analytics by letting execution engines that can provide
that capability, such as Hive on Tez, Impala, Spark SQL or Hive on HBase,
serve as the execution path. It is being built with the smartness to know
which one to use at runtime, without the user specifying it. Along those
lines, Kylin can be added as one more execution engine if we want to use
Kylin in the path.
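
To illustrate the runtime choice described above, here is a minimal Python
sketch. It is not Lens code: the `Engine` class, the `supports` predicate,
the `estimated_cost` function and all the numbers are hypothetical, and the
real selection logic may differ. It only shows the general idea of asking
each pluggable engine whether it can run a query and picking the cheapest
one that can.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Query = Dict[str, Any]  # hypothetical stand-in for a parsed query


@dataclass
class Engine:
    """Hypothetical pluggable execution engine (not the Lens driver API)."""
    name: str
    supports: Callable[[Query], bool]         # can this engine run the query?
    estimated_cost: Callable[[Query], float]  # lower is better


def choose_engine(engines: List[Engine], query: Query) -> Engine:
    """Return the cheapest engine that is able to run the query."""
    candidates = [e for e in engines if e.supports(query)]
    if not candidates:
        raise LookupError("no execution engine can run this query")
    return min(candidates, key=lambda e: e.estimated_cost(query))


# Made-up engines and costs, for illustration only.
ENGINES = [
    Engine("hive-on-tez", lambda q: True, lambda q: 100.0),
    Engine("impala", lambda q: q["rows"] < 1_000_000, lambda q: 10.0),
    # An engine backed by pre-built cubes could plug in the same way.
    Engine("kylin", lambda q: bool(q["cube_available"]), lambda q: 1.0),
]

small = {"rows": 50_000, "cube_available": False}
print(choose_engine(ENGINES, small).name)  # prints "impala"
```

The user only submits the query; which engine ends up running it is decided
by the selection step.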

Other goals of Lens include providing an enterprise-rich API for the user
for submitting a query, fetching results, sending results in an email,
scheduling a query, etc.


>     The approach Kylin has picked so far is MOLAP, which pre-calculates
> and stores results as an OLAP cube (using HBase as storage now), then
> enables clients to query data via ANSI SQL.
>

>
> And Lens' approach (please correct me if I'm wrong) is to define an
> abstract query layer over underlying storage including HDFS, Hive and other
> RDBMSs, then enable clients to query via a SQL-like language to access data
> without knowing the details. I’m not sure whether there is any “calculation”
> part in Lens since I couldn’t find documentation about it. Could you please
> let us know if there’s some reference, so that we can understand Lens more
> deeply?
>
Yes. Lens does not do any pre-calculation. But it can understand that
pre-aggregated tables are available, so at query time it decides to hit
those tables. Basically, Lens does not have any ETL attached to it. At
InMobi, we use Apache Falcon for all the ETL required.
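
A minimal Python sketch of that query-time decision, assuming hypothetical
table names, columns and row counts (none of them come from Lens): among the
tables that contain every queried column, pick the smallest one, so a
pre-aggregated table is preferred over the raw fact table whenever it can
answer the query. The tables themselves are assumed to have been built
elsewhere by the ETL.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class FactTable:
    """Hypothetical description of a fact table, raw or pre-aggregated."""
    name: str
    columns: FrozenSet[str]
    row_count: int


def pick_fact_table(tables: List[FactTable], queried: Iterable[str]) -> FactTable:
    """Return the smallest table that contains all queried columns."""
    needed = frozenset(queried)
    candidates = [t for t in tables if needed <= t.columns]
    if not candidates:
        raise LookupError(f"no table has columns {sorted(needed)}")
    return min(candidates, key=lambda t: t.row_count)


# Made-up tables, for illustration only.
TABLES = [
    FactTable("raw_fact", frozenset({"day", "country", "city", "revenue"}), 10**9),
    FactTable("agg_by_country", frozenset({"day", "country", "revenue"}), 10**5),
]

print(pick_fact_table(TABLES, ["country", "revenue"]).name)  # prints "agg_by_country"
print(pick_fact_table(TABLES, ["city", "revenue"]).name)     # prints "raw_fact"
```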

>

>      We welcome and are open to discussing possible collaboration with
> everyone. As we have stated on our web site (
> http://www.kylin.io/assets/images/core.png), we would like to work with
> the entire community to build an ecosystem around Kylin that offers better
> analytics capability for big data: to extend Kylin's features, to integrate
> with others, to offer customized interfaces, and also to adopt new concepts
> in the core module (for example, we are bringing in another storage and
> query mechanism called InvertedIndex in the next release).
>       Looking forward to your ideas about Lens and Kylin.
>
The main area of collaboration I'm looking forward to is strengthening the
OLAP model on big data storage: both projects are building an OLAP model on
Hadoop-based systems, and I see many things being repeated in both
systems.

You can find more on OLAP in Lens here -
http://lens.incubator.apache.org/user/olap-cube.html



> Thanks.
> Luke
>
>
> 2014-12-26 19:32 GMT+08:00 Amareshwari Sriramdasu <am...@apache.org>:
>
>> Hello Kylin developers,
>>
>> I'm a developer at Apache Lens (incubator.apache.org/projects/lens.html),
>> doc available at
>>
>> http://svn.apache.org/repos/asf/incubator/lens/site/publish/current/index.html
>> ,
>> which tries to solve a similar problem as Kylin wrt OLAP cubes. So, I'm
>> sending out this mail to understand commonalities and see if we can reuse
>> and collaborate on some.
>>
>> Lens is an analytics platform which tries to give the ability to create
>> OLAP cubes on top of HCatalog tables. It supports multiple storages as the
>> underlying storage for fact and dimension data, such as HDFS, HBase and
>> traditional DWH, with pluggable execution engines to read the underlying
>> data. Lens provides other services like a query lifecycle manager (with
>> history, statistics), which makes it possible to know which columns are
>> frequently queried so that aggregated facts can be created on them.
>>
>> You can see the OLAP cube in Lens here -
>>
>> http://svn.apache.org/repos/asf/incubator/lens/site/publish/current/user/olap-cube.html
>>
>> After going through the Kylin docs (
>> http://www.slideshare.net/YangLi43/apache-kylin-deep-dive-2014-dec), I
>> understand (correct me if I'm wrong) that
>> Kylin builds a cube by storing aggregated facts materialized in HBase; it
>> constructs aggregated facts from tables in HDFS. It also provides the
>> ability for an administrator to define cubes and cuboids.
>>
>> I see the commonalities are mainly wrt OLAP cube definitions. The
>> differentiator is that Kylin provides an execution engine for running a
>> query on a cube, whereas Lens doesn't have any execution engine in itself.
>>
>> Let us know if the above details sound fine. If so, we can look at what
>> we can do next to understand more.
>>
>> Thanks
>> Amareshwari
>>
>
>