Posted to dev@kylin.apache.org by Luke Han <lu...@gmail.com> on 2015/02/05 07:50:31 UTC

Fwd: Proposal - Kylin stream cube builder

Forward to mailing list for further discussion.

On Tuesday, December 30, 2014 at 6:30:17 PM UTC+8, Li Yang wrote:
>
> What Xu described is the ultimate goal in this direction -- realtime!
>
> For this feature, the scope is narrower: it focuses on micro-segment cube
> building. The goal is to reduce cube data delay to within one hour.
>
> Realtime is on the long-term roadmap, of course! We will come to it when
> the inverted index is mature.
>
> Cheers
> Yang
>
>
> On Thursday, December 25, 2014 at 1:14:58 PM UTC+8, Luke Han wrote:
>>
>> GitHub issue for tracking this new feature:
>> https://github.com/KylinOLAP/Kylin/issues/262
>>
>>
>>
>> On Thursday, December 25, 2014 at 7:11:49 AM UTC+8, Branky Shao wrote:
>>>
>>> Xu, thanks for your comment. For our internal use cases like Nous, the
>>> query pattern is deterministic. Defining the "popular path" can be treated
>>> as an additional step in cube modeling, and only queries on the "popular
>>> path" are fully optimized.
>>>
>>> On Tuesday, December 23, 2014 6:37:23 PM UTC-8, Jiang Xu wrote:
>>>>
>>>> "Popular path" is a kind of partial cube materialization that depends
>>>> heavily on the query pattern. Since we can't predict the user's query
>>>> pattern, I have some concerns about the query performance.
>>>>
>>>> I think the more important thing is to bring real-time capability
>>>> into Kylin. That involves three things:
>>>>
>>>> 1. Real-time build of in-memory storage.
>>>> I suggest building a bitmap index in memory instead of a cube. Kafka &
>>>> Storm are a good option.
>>>>
>>>> 2. Real-time query of in-memory storage.
>>>> We need a distributed execution engine for real-time queries. I have some
>>>> concerns about the RPC mechanism of Storm.
>>>>
>>>> 3. Dumping in-memory storage into Hadoop.
>>>> Storm is a good option for it.
>>>>
>>>>
>>>>

Re: Proposal - Kylin stream cube builder

Posted by Li Yang <li...@apache.org>.

I'm wrapping up designs on streaming cubing and inverted-index. Will
publish here soon.
