Posted to issues@kylin.apache.org by "Yifei Wu (JIRA)" <ji...@apache.org> on 2017/12/02 15:59:00 UTC
[jira] [Commented] (KYLIN-3078) the estimated size of percentile measure is too big
[ https://issues.apache.org/jira/browse/KYLIN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275628#comment-16275628 ]
Yifei Wu commented on KYLIN-3078:
---------------------------------
The key is to clarify how the percentile measure affects the cube size estimate and to find a more accurate way to estimate the size of a percentile measure.
Because the measure is implemented with the t-digest algorithm, we can derive a regular pattern from the analysis in the t-digest paper and from statistics collected in local tests.
> the estimated size of percentile measure is too big
> ----------------------------------------------------
>
> Key: KYLIN-3078
> URL: https://issues.apache.org/jira/browse/KYLIN-3078
> Project: Kylin
> Issue Type: Bug
> Reporter: Yifei Wu
> Assignee: Yifei Wu
> Priority: Critical
>
> To set a shard number that keeps the size of each shard under control, we need to roughly estimate the cube size before building the cube by summing the sizes of all dimensions and measures. However, the current size calculation for the percentile measure is inaccurate and results in too many partitions for cube storage. This may in turn hurt SQL query performance.
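
To illustrate the description above, here is a minimal sketch of how a per-row size estimate could feed into the shard count. This is not Kylin's actual code: the class ShardEstimate, both method names, the 64 MB target shard size, and every byte figure are assumptions made up for the example. It only shows that an inflated percentile estimate translates directly into more shards.

```java
public class ShardEstimate {

    // Rough cube size in bytes: row count times the summed per-row
    // sizes of all dimensions and measures (hypothetical model).
    static long estimateCubeBytes(long rowCount, int[] dimensionBytes, int[] measureBytes) {
        long perRow = 0;
        for (int b : dimensionBytes) {
            perRow += b;
        }
        for (int b : measureBytes) {
            perRow += b;
        }
        return rowCount * perRow;
    }

    // Number of shards so that each shard holds at most targetShardBytes
    // (ceiling division, with a minimum of one shard).
    static int shardCount(long cubeBytes, long targetShardBytes) {
        long shards = (cubeBytes + targetShardBytes - 1) / targetShardBytes;
        return (int) Math.max(1, shards);
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;                  // assumed row count
        int[] dims = {4, 4, 8};                  // assumed dimension sizes in bytes
        long target = 64L * 1024 * 1024;         // assumed target shard size (64 MB)

        // Second entry is the percentile measure; both values are invented
        // to contrast an oversized estimate with a smaller one.
        int[] inflatedMeasures = {8, 16_000};
        int[] smallerMeasures = {8, 1_600};

        long inflated = estimateCubeBytes(rows, dims, inflatedMeasures);
        long smaller = estimateCubeBytes(rows, dims, smallerMeasures);

        System.out.println("shards with inflated estimate: " + shardCount(inflated, target));
        System.out.println("shards with smaller estimate:  " + shardCount(smaller, target));
    }
}
```

With the same row count and dimensions, only the percentile entry changes between the two runs, so any difference in the shard count comes from that one estimate.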
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)