Posted to issues@kylin.apache.org by "Zhong Yanghong (JIRA)" <ji...@apache.org> on 2017/07/19 08:04:00 UTC

[jira] [Comment Edited] (KYLIN-2727) Introduce cube planner able to select cost-effective cuboids to be built by cost-based algorithms

    [ https://issues.apache.org/jira/browse/KYLIN-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092741#comment-16092741 ] 

Zhong Yanghong edited comment on KYLIN-2727 at 7/19/17 8:03 AM:
----------------------------------------------------------------

bq. How does the new planner judge the cost and benefit of a cuboid? Based on past usage or something else? Is there any document you could share?

The cube planner judges a cuboid's cost and benefit from both *past usage* and the *rollup cost from target cuboid to source cuboid*. See [a referenced paper|http://ilpubs.stanford.edu:8090/102/1/1995-34.pdf].
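
As a rough illustration of how past usage and rollup cost can be combined, here is a minimal greedy sketch in Python. It is not Kylin's actual implementation. The bitmask cuboid ids, the function names, the use of estimated row counts as the rollup cost, and the `max_extra` budget are all assumptions made for this example.

```python
def is_ancestor(parent: int, child: int) -> bool:
    """True if `parent` contains every dimension of `child` (bitmask ids)."""
    return parent & child == child


def select_cuboids(rows, hits, base, max_extra):
    """Greedily pick up to `max_extra` cuboids in addition to the base cuboid.

    rows: cuboid id -> estimated row count (assumed proxy for rollup cost, > 0)
    hits: cuboid id -> how often past queries targeted that cuboid
    """
    selected = [base]
    # cost[c] = rows of the cheapest selected cuboid able to answer c
    cost = {c: rows[base] for c in rows}
    for _ in range(max_extra):
        best, best_ratio = None, 0.0
        for cand in rows:
            if cand in selected:
                continue
            # Usage-weighted rows saved for every cuboid `cand` can answer
            benefit = sum(
                hits.get(c, 0) * max(0, cost[c] - rows[cand])
                for c in rows
                if is_ancestor(cand, c)
            )
            ratio = benefit / rows[cand]  # benefit per row stored
            if ratio > best_ratio:
                best, best_ratio = cand, ratio
        if best is None:  # no remaining candidate has a positive benefit
            break
        selected.append(best)
        for c in rows:
            if is_ancestor(best, c):
                cost[c] = min(cost[c], rows[best])
    return selected
```

In this sketch, a small cuboid that receives most of the query hits is picked first, because it saves the most scanned rows per row stored.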

bq. A new cuboid scheduler is a big change. To manage the risk of such change, we usually introduce a transition period, where the new and the old coexist. The coexistence of MR engine and Spark engine is an example. We should do the same for the scheduler.

The new cuboid scheduler will coexist with the current one. Not all cubes will use the cube planner. There are two cases in which the cube planner will not be used:
* Incremental building for existing cubes that have ready segments
* The total number of cuboids calculated from static rules is less than a threshold (1024 in eBay usage)

Cubes that do not use the cube planner will continue to use the previous cuboid scheduler.
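
The two exclusion cases can be sketched as a simple check. This is only an illustration: the function and parameter names are hypothetical and do not correspond to real Kylin classes or configuration keys. The default threshold of 1024 is the value mentioned above for eBay usage.

```python
def use_cube_planner(has_ready_segments: bool,
                     static_cuboid_count: int,
                     threshold: int = 1024) -> bool:
    """Return True if the cube planner should choose the cuboids to build."""
    # Case 1: incremental build on a cube that already has ready segments
    if has_ready_segments:
        return False
    # Case 2: static rules already yield fewer cuboids than the threshold
    if static_cuboid_count < threshold:
        return False
    return True
```

When the check returns False, the cube stays on the previous cuboid scheduler.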



> Introduce cube planner able to select cost-effective cuboids to be built by cost-based algorithms
> -------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-2727
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2727
>             Project: Kylin
>          Issue Type: New Feature
>    Affects Versions: v2.0.0
>            Reporter: Zhong Yanghong
>            Assignee: Zhong Yanghong
>
> There are several disadvantages to creating partial cubes based only on static rules:
> * Learning the concepts behind these rules places an extra burden on cube admins
> * Achieving a target cuboid set may require a very complicated set of static rules
> * Cube designers may not be familiar with business-related query patterns, so few static rules can be applied
> * Static rules created at first may not be correct, or user query patterns may change over time
> The goal of the cube planner is to reduce the effort cube admins need to design an effective partial cube. It has the following advantages:
> * No sophisticated combination of static rules is needed
> * Allows more dimensions to be included with few static rules
> * Can adjust a non-cost-effective cuboid set to a cost-effective one based on historical queries



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)