Posted to issues@kylin.apache.org by "Chao Long (JIRA)" <ji...@apache.org> on 2018/07/24 10:12:00 UTC

[jira] [Comment Edited] (KYLIN-3453) Improve cube size estimation for TOPN, COUNT DISTINCT

    [ https://issues.apache.org/jira/browse/KYLIN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553990#comment-16553990 ] 

Chao Long edited comment on KYLIN-3453 at 7/24/18 10:11 AM:
------------------------------------------------------------

I have run some tests on the enhancement, and the results are as follows:

1. Test sample cube

     cube real size: 

     !image-2018-07-24-16-30-50-804.png!      

    before enhancement

      !image-2018-07-24-16-33-43-231.png!

    after enhancement

     !image-2018-07-24-16-29-07-359.png!      

 

2. SSB dataset

   cube real size

     !image-2018-07-24-16-37-09-199.png!

   before enhancement

     !image-2018-07-24-17-11-27-829.png!      

   after enhancement

     !image-2018-07-24-17-12-25-880.png!

These results show that the cube size estimation is more accurate after the enhancement.


was (Author: wayne0101):
After the enhancement, I did some tests.

1. Test sample cube

     cube real size: 

    !image-2018-07-24-16-30-50-804.png!      

    before enhancement

     !image-2018-07-24-16-33-43-231.png!

    after enhancement

     !image-2018-07-24-16-29-07-359.png!      

 

2. SSB dataset

   cube real size

    !image-2018-07-24-16-37-09-199.png!

   before enhancement

    !image-2018-07-24-17-11-27-829.png!      

   after enhancement

    !image-2018-07-24-17-12-25-880.png!

From the test results, we can see that the cube size estimates are more accurate after the enhancement.

> Improve cube size estimation for TOPN, COUNT DISTINCT
> -----------------------------------------------------
>
>                 Key: KYLIN-3453
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3453
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: Chao Long
>            Assignee: Chao Long
>            Priority: Major
>             Fix For: v2.5.0
>
>         Attachments: image-2018-07-24-16-29-07-359.png, image-2018-07-24-16-30-50-804.png, image-2018-07-24-16-33-43-231.png, image-2018-07-24-16-37-09-199.png, image-2018-07-24-17-11-26-283.png, image-2018-07-24-17-11-27-829.png, image-2018-07-24-17-12-25-880.png
>
>
> Currently, Kylin estimates cube size poorly for TOPN and COUNT DISTINCT measures. We should improve the estimation so that we get a reasonable split number when building a cube.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)