Posted to issues@kylin.apache.org by "Zhong Yanghong (JIRA)" <ji...@apache.org> on 2016/09/08 10:19:20 UTC

[jira] [Commented] (KYLIN-1960) Provide a new build type called COVER

    [ https://issues.apache.org/jira/browse/KYLIN-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15473475#comment-15473475 ] 

Zhong Yanghong commented on KYLIN-1960:
---------------------------------------

There seems to be a limitation in the current CubeBuildTypeEnum.BUILD. Suppose users have a cube whose PartitionDateStart is 0, and they want to build a segment from PartitionDateStart to t2. By default, Kylin uses whether the startDate parameter is 0 to decide whether to build an appending segment, so a build that legitimately starts at 0 is treated as an append and the resulting behavior is not what users expect.

Overall, it would be better to introduce an additional parameter that indicates whether the request is an append or an ordinary build, so that it stays compatible with previous versions.
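
To make the ambiguity concrete, here is a minimal sketch in Python. It is not Kylin code: BuildMode, infer_mode_implicitly, resolve_mode, and the append flag are hypothetical names, and the "startDate == 0 means append" rule is modeled only as described in this comment.

```python
from enum import Enum
from typing import Optional


class BuildMode(Enum):
    APPEND = "append"
    ORDINARY = "ordinary"


def infer_mode_implicitly(start_date: int) -> BuildMode:
    # Assumed legacy rule (from the comment above): startDate == 0 means append.
    return BuildMode.APPEND if start_date == 0 else BuildMode.ORDINARY


def resolve_mode(start_date: int, append: Optional[bool] = None) -> BuildMode:
    # Hypothetical explicit flag. When it is omitted, fall back to the
    # legacy rule so that existing callers keep their current behavior.
    if append is None:
        return infer_mode_implicitly(start_date)
    return BuildMode.APPEND if append else BuildMode.ORDINARY


# A cube with PartitionDateStart = 0: the user wants an ordinary build of [0, t2).
print(infer_mode_implicitly(0))        # BuildMode.APPEND (not what was intended)
print(resolve_mode(0, append=False))   # BuildMode.ORDINARY
print(resolve_mode(0))                 # BuildMode.APPEND (legacy behavior kept)
```

With an explicit flag, a start date of 0 no longer has to carry a second meaning, while requests that omit the flag behave as before.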

I'm wondering why we didn't create a cube build type, APPEND, in the first place?

> Provide a new build type called COVER
> -------------------------------------
>
>                 Key: KYLIN-1960
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1960
>             Project: Kylin
>          Issue Type: New Feature
>            Reporter: Zhong Yanghong
>            Assignee: Zhong Yanghong
>         Attachments: provide_a_new_build_type_COVER.patch
>
>
> The current three build types do not handle incremental builds that also refresh old data well.
> For example, there are [S1, S2, S3] segments in a cube. S1 with time range [t1, t2); S2 with time range [t2, t3); S3 with time range [t3, t4).
> Now users want to refresh the old data within [t2, t4). The current strategy is to first merge S2 and S3, and then refresh the larger segment with time range [t2, t4). The first step is meaningless and wasteful.
> What if users want to build a new segment with time range [t2, t5), where t5 is larger than t4? The current strategy is clumsy.
> How about providing a new build type? Here, it's called COVER. For this type, users should also provide a start time and an end time as parameters. The start time should match one of the segment boundaries. In this case, it should be in the set {t1, t2, t3, t4}. The end time should either be larger than t4 or match the end time of an existing segment. In this case, it should be larger than t4 or in the set {t2, t3, t4}. Of course, the end time should be larger than the start time.
> What the job engine will do with this build type is as follows:
> 1. first build a new segment with the time range;
> 2. then the covered segments will be deleted.
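
The validation rules and the two steps described above can be sketched as follows. This is only an illustration in Python, not the attached patch: cover_build and the Segment tuple are hypothetical names, segments are modeled as half-open integer ranges [start, end), and the existing segments are assumed to be contiguous.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open time range [start, end)


def cover_build(segments: List[Segment], start: int, end: int) -> List[Segment]:
    """Return the segment list after a COVER build of [start, end)."""
    if not segments:
        raise ValueError("no existing segments to cover")
    if start >= end:
        raise ValueError("end time must be larger than start time")

    ordered = sorted(segments)
    last_end = ordered[-1][1]
    # Segment boundaries: every segment start plus the end of the last segment.
    boundaries = {s for s, _ in ordered} | {last_end}
    ends = {e for _, e in ordered}

    if start not in boundaries:
        raise ValueError("start time must match a segment boundary")
    if not (end > last_end or end in ends):
        raise ValueError("end time must exceed the last end or match a segment end")

    # Step 1: build the new segment with the requested time range.
    new_segment = (start, end)
    # Step 2: delete every existing segment fully covered by the new one.
    kept = [(s, e) for s, e in ordered if not (start <= s and e <= end)]
    return sorted(kept + [new_segment])


# t1..t4 = 10, 20, 30, 40 and t5 = 50
segs = [(10, 20), (20, 30), (30, 40)]
print(cover_build(segs, 20, 40))  # [(10, 20), (20, 40)]
print(cover_build(segs, 20, 50))  # [(10, 20), (20, 50)]
```

Because both ends of the range are checked against existing boundaries, the new segment can never partially overlap an old one, so the deletion step only has to look for fully covered segments.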



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)