Posted to issues@kylin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/12/18 13:20:00 UTC

[jira] [Commented] (KYLIN-4185) CubeStatsReader estimate wrong cube size

    [ https://issues.apache.org/jira/browse/KYLIN-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999144#comment-16999144 ] 

ASF GitHub Bot commented on KYLIN-4185:
---------------------------------------

zhoukangcn commented on pull request #1005: KYLIN-4185: optimize CuboidSizeMap by using historical segments
URL: https://github.com/apache/kylin/pull/1005
 
 
   see: https://issues.apache.org/jira/browse/kylin-4185
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> CubeStatsReader estimate wrong cube size
> ----------------------------------------
>
>                 Key: KYLIN-4185
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4185
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: ZhouKang
>            Priority: Major
>
> CubeStatsReader estimates the cube size incorrectly, which causes a number of problems.
> When the estimated size is much larger than the real size, the Spark application's executor number is small, and the cube build step takes a long time. Sometimes the step fails due to the large dataset.
> When the estimated size is much smaller than the real size, the cuboid files in HDFS are small, and there are many of them.
>  
> In our production environment, both situations have occurred.
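The PR title suggests calibrating the estimate with data from already-built (historical) segments. A minimal sketch of that idea, assuming a single correction ratio derived from one historical segment's estimated vs. actual size; the class and method names are hypothetical and do not reflect Kylin's actual implementation:

```java
import java.util.HashMap;
import java.util.Map;

public class CuboidSizeCalibrator {

    /**
     * Scale per-cuboid size estimates by the ratio of actual to estimated
     * size observed on a historical segment.
     *
     * @param estimatedSizes      estimator's per-cuboid sizes (MB) for the new segment
     * @param historicalEstimated estimator's total size (MB) for a built segment
     * @param historicalActual    actual on-disk size (MB) of that built segment
     * @return calibrated per-cuboid sizes (MB)
     */
    public static Map<Long, Double> calibrate(Map<Long, Double> estimatedSizes,
                                              double historicalEstimated,
                                              double historicalActual) {
        double ratio = historicalActual / historicalEstimated;
        Map<Long, Double> calibrated = new HashMap<>();
        for (Map.Entry<Long, Double> e : estimatedSizes.entrySet()) {
            calibrated.put(e.getKey(), e.getValue() * ratio);
        }
        return calibrated;
    }
}
```

Correcting the per-cuboid estimates this way would shrink an over-estimate (fewer, better-sized output files) and grow an under-estimate (more executors, fewer build failures), addressing both situations described above.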



--
This message was sent by Atlassian Jira
(v8.3.4#803005)