Posted to issues@kylin.apache.org by "XiaoXiang Yu (JIRA)" <ji...@apache.org> on 2019/04/18 11:45:00 UTC

[jira] [Comment Edited] (KYLIN-3962) Support streaming cubing using Spark Streaming or Flink

    [ https://issues.apache.org/jira/browse/KYLIN-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820980#comment-16820980 ] 

XiaoXiang Yu edited comment on KYLIN-3962 at 4/18/19 11:44 AM:
---------------------------------------------------------------

If I use Flink streaming to ingest streaming messages and write cuboid data (I think this may be <Dimension Array, MeasureAggregator>) to external storage (like HBase or Redis), I think it will have some drawbacks:
 - Using remote storage instead of local storage will increase the data preparation delay.
 - It introduces an external dependency such as a Flink cluster.
 - It puts heavy pressure on the external storage (every incoming message may cause a read/write to the storage layer).
 - If we decide to use more cuboids, it will cause too many reads/writes to storage. If we decide to use fewer cuboids, most queries will hit the base cuboid, and filtering and aggregation will be slower if the data is remote (we can use a memory cache in the receiver).
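
To make the storage-pressure point concrete, here is a minimal Java sketch that compares writing every message straight to the store with aggregating locally first. Everything in it is an assumption for illustration: CuboidWriteSketch, CountingStore, and Message are hypothetical names, the "store" is an in-memory map that only counts writes, and no Kylin, Flink, HBase, or Redis API is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from Kylin or Flink. */
public class CuboidWriteSketch {

    /** Stand-in for a remote store such as HBase or Redis; it only counts writes. */
    static class CountingStore {
        final Map<List<String>, Long> rows = new HashMap<>();
        int writes = 0;

        /** Merge a partial SUM into the row for this dimension array. */
        void merge(List<String> dims, long sum) {
            rows.merge(dims, sum, Long::sum);
            writes++;
        }
    }

    /** One message: a dimension array plus a single SUM measure. */
    static class Message {
        final List<String> dims;
        final long measure;

        Message(List<String> dims, long measure) {
            this.dims = dims;
            this.measure = measure;
        }
    }

    /** Naive path: every incoming message triggers one remote read/write. */
    static void writePerMessage(List<Message> messages, CountingStore store) {
        for (Message m : messages) {
            store.merge(m.dims, m.measure);
        }
    }

    /** Buffered path: aggregate locally, then flush one write per distinct key. */
    static void writeBuffered(List<Message> messages, CountingStore store) {
        Map<List<String>, Long> buffer = new HashMap<>();
        for (Message m : messages) {
            buffer.merge(m.dims, m.measure, Long::sum);
        }
        buffer.forEach(store::merge);
    }

    public static void main(String[] args) {
        List<Message> messages = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            // Only two distinct dimension combinations in this toy stream.
            messages.add(new Message(Arrays.asList("cn", i % 2 == 0 ? "web" : "app"), 1L));
        }
        CountingStore perMessage = new CountingStore();
        CountingStore buffered = new CountingStore();
        writePerMessage(messages, perMessage);
        writeBuffered(messages, buffered);
        System.out.println("per-message writes: " + perMessage.writes);
        System.out.println("buffered writes:    " + buffered.writes);
    }
}
```

With this toy stream of 1000 messages over two dimension combinations, the per-message path issues 1000 writes and the buffered path issues 2, while both stores end up with the same aggregated rows. The buffering is only a rough stand-in for the local storage or memory cache mentioned above; a real job would also need to bound the buffer and handle failures before flushing.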


was (Author: hit_lacus):
If I use Flink streaming to ingest streaming messages and write cuboid data (I think this may be <Dimension Array, MeasureAggregator>) to external storage (like HBase or Redis), I think it will have some drawbacks:
- Using remote storage instead of local storage will increase the data preparation delay.
- It introduces an external dependency such as a Flink cluster.
- It puts too much pressure on the external storage.
- Filtering and aggregation will be slower if the data is remote.

> Support streaming cubing using Spark Streaming or Flink
> -------------------------------------------------------
>
>                 Key: KYLIN-3962
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3962
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: Liu Shaohui
>            Priority: Major
>
> KYLIN-3654 introduced Real-time Streaming, but in my opinion, the architecture is a little too complicated to handle.
> Streaming frameworks like Spark Streaming and Flink are widely used in many companies. Can we use such a streaming framework to support real-time cubing in Kylin?
> This is just a proposal. More discussion and suggestions are welcome~
> More details of this proposal will be added later.
>  


