
Posted to dev@kylin.apache.org by Sarnath <st...@gmail.com> on 2015/09/15 03:57:04 UTC

Feedback Please : Serving Aggregations through ElasticSearch

Hi Luke and other Seniors,

I come from the Big Data Center of Excellence at an Indian IT major.

We have been experimenting with the idea of serving cubes through the
ElasticSearch REST API. This is not related to Kylin. This is our own
internal development.

But, I would like to hear some feedback from the designers, architects and
developers in this forum. Thanks in advance for your time.

The motivation is this: once the cube is built, it needs to be served.

The queries look somewhat like this:
"Given ProductID=*, Year=2015, Fetch All Quantities Sold"
"Given ProductID=XX, Fetch how much it has sold every Month"
Find all entries that match K1=V1, K2=V2
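
A minimal sketch of how such a lookup could be expressed as an
Elasticsearch search body, assuming each pre-aggregated cube cell is stored
as one document whose fields are the dimension names. The helper name
build_cube_query, the field names, and the reading of "*" as "no
constraint" are assumptions for illustration, not details from the thread;
the bool/filter/term form assumes a reasonably recent Elasticsearch query
DSL.

```python
import json


def build_cube_query(filters, size=100):
    """Build a search body that matches every given key=value pair.

    A value of "*" is treated as "no constraint" on that dimension
    (an assumption about the notation used in the example queries).
    """
    clauses = [
        {"term": {field: value}}
        for field, value in sorted(filters.items())
        if value != "*"
    ]
    # 'filter' clauses require exact matches and skip relevance scoring.
    return {"size": size, "query": {"bool": {"filter": clauses}}}


# "Given ProductID=*, Year=2015, Fetch All Quantities Sold"
body = build_cube_query({"ProductID": "*", "Year": 2015})
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the _search endpoint of whichever index
holds the cube; filter clauses fit here because cube lookups need exact
matches rather than ranked results.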

This relieves us of a lot of work (storage, REST API, etc.) and makes the
cubes easily searchable.

However, we don't support SQL/MDX on top of it. Tableau 9.1 Beta is
experimenting with a Web Data Connector, which we believe can be used for
visualization. Apart from that, we experimented with a few auto-generated
Kibana dashboards, which were just okay. Kibana was not designed for cubes,
so it has its own limitations.

I would love to hear some feedback from the architects regarding the
pros and cons of this approach.

Thanks,
Best,
Sarnath

Re: Feedback Please : Serving Aggregations through ElasticSearch

Posted by Sarnath <st...@gmail.com>.

Hi All,
Thanks for all your time. It was very useful. At least, I am happy to know
that I am not in uncharted waters.

I imagine writing a Calcite wrapper on top of the REST API to expose the
cubes via SQL, or possibly MDX. Calcite will help with that, right?
I heard of Calcite only through this forum.

Thanks for all your time,
Appreciate it much!
Best,
Sarnath

Re: Feedback Please : Serving Aggregations through ElasticSearch

Posted by Luke Han <lu...@gmail.com>.

Hi Sarnath,
    If you are building a special-purpose application, an inverted index may
be a good option: limited functions/features, but easy to implement and
optimize.
    If you are building a general-purpose application, then the query
interface, flexible data model management, seamless integration with BI
tools... will become more important.

    Here at eBay, there's an II (inverted index) based OLAP solution (not the
Kylin one), just using HBase as storage, similar to your idea :-)

    Thanks.

Luke




Best Regards!
---------------------

Luke Han

On Wed, Sep 16, 2015 at 10:40 PM, Adunuthula, Seshu <sa...@ebay.com>
wrote:

> Sarnath,
>
> Using an inverted index for Big Data analytics is a well-accepted pattern,
> and the best implementation (IMO) out there is Oracle BDD.
> https://www.oracle.com/big-data/big-data-discovery/index.html
>
> As Yang has pointed out, when the sophistication of SQL is not required,
> this is an ideal approach.
>
> Regards
> Seshu Adunuthula

Re: Feedback Please : Serving Aggregations through ElasticSearch

Posted by "Adunuthula, Seshu" <sa...@ebay.com>.

Sarnath,

Using an inverted index for Big Data analytics is a well-accepted pattern,
and the best implementation (IMO) out there is Oracle BDD.
https://www.oracle.com/big-data/big-data-discovery/index.html

As Yang has pointed out, when the sophistication of SQL is not required,
this is an ideal approach.

Regards
Seshu Adunuthula


On 9/16/15, 12:43 AM, "Li Yang" <li...@apache.org> wrote:

>Inverted index of pre-aggregates works when the query pattern is
>determined
>and limited, like your case.
>
>Later, if you want to support flexible queries by possibly any combination
>of columns, then a more general MOLAP engine like Kylin may come into
>sight.
>
>The SQL interface is again for flexible queries. In addition, it allows
>integration with any BI tool that extracts data through SQL, not just
>Tableau.
>
>Cheers
>Yang

Re: Feedback Please : Serving Aggregations through ElasticSearch

Posted by Li Yang <li...@apache.org>.

Inverted index of pre-aggregates works when the query pattern is determined
and limited, like your case.
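
As a rough illustration of the idea, here is a minimal in-memory sketch of
an inverted index over pre-aggregated rows. The sample rows, dimension
names, and function names are hypothetical; a real deployment would rely on
the search engine's own index rather than code like this.

```python
from collections import defaultdict

# Hypothetical pre-aggregated cube cells: dimension values plus one measure.
ROWS = [
    {"ProductID": "A", "Year": 2015, "Month": 1, "Quantity": 10},
    {"ProductID": "A", "Year": 2015, "Month": 2, "Quantity": 7},
    {"ProductID": "B", "Year": 2015, "Month": 1, "Quantity": 4},
    {"ProductID": "B", "Year": 2014, "Month": 1, "Quantity": 9},
]
DIMENSIONS = ("ProductID", "Year", "Month")


def build_index(rows):
    """Map each (dimension, value) pair to the ids of rows containing it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for dim in DIMENSIONS:
            index[(dim, row[dim])].add(row_id)
    return index


def lookup(index, rows, **filters):
    """Return rows matching all filters by intersecting posting sets."""
    matches = set(range(len(rows)))
    for dim, value in filters.items():
        matches &= index.get((dim, value), set())
    return [rows[i] for i in sorted(matches)]


index = build_index(ROWS)
hits = lookup(index, ROWS, ProductID="A", Year=2015)
total = sum(row["Quantity"] for row in hits)
print(total)
```

Each filter costs one dictionary lookup plus a set intersection, which is
why a known, limited set of key=value patterns is cheap to serve this way.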

Later, if you want to support flexible queries by possibly any combination
of columns, then a more general MOLAP engine like Kylin may come into sight.
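
To see why arbitrary column combinations are harder to serve, one can count
the possible groupings: n dimensions give 2^n combinations, and each may
need its own pre-aggregate if everything is computed ahead of time. A small
sketch, with dimension names borrowed from the example queries and a
hypothetical helper named cuboids:

```python
from itertools import combinations


def cuboids(dimensions):
    """Yield every subset of dimensions a query could group or filter by."""
    for k in range(len(dimensions) + 1):
        yield from combinations(dimensions, k)


dims = ["ProductID", "Year", "Month"]
all_cuboids = list(cuboids(dims))
# Three dimensions give 2 ** 3 combinations, including the empty grouping.
print(len(all_cuboids))
```

The count doubles with every added dimension, so fixing the query patterns
in advance keeps the number of pre-aggregates small.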

The SQL interface is again for flexible queries. In addition, it allows
integration with any BI tool that extracts data through SQL, not just
Tableau.

Cheers
Yang
