Posted to issues@kylin.apache.org by "Yerui Sun (JIRA)" <ji...@apache.org> on 2016/07/06 03:25:11 UTC
[jira] [Resolved] (KYLIN-1379) More stable and functional precise
count distinct implements after KYLIN-1186
[ https://issues.apache.org/jira/browse/KYLIN-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yerui Sun resolved KYLIN-1379.
------------------------------
Resolution: Fixed
Fix Version/s: v1.5.3
> More stable and functional precise count distinct implements after KYLIN-1186
> -----------------------------------------------------------------------------
>
> Key: KYLIN-1379
> URL: https://issues.apache.org/jira/browse/KYLIN-1379
> Project: Kylin
> Issue Type: Improvement
> Components: Job Engine
> Affects Versions: v1.5.0, v1.3.0
> Reporter: Yerui Sun
> Assignee: Yerui Sun
> Fix For: v1.5.3
>
>
> After KYLIN-1186, we've gained the ability to count distinct Int type columns precisely.
> However, the implementation from KYLIN-1186 is not stable, especially in the 2.x-staging branch.
> The reason is that the 2.x version uses the measure's maxlength to allocate memory, and KYLIN-1186 hardcodes the BitmapMeasure to 8MB, which causes OOM during cube building.
> To resolve this problem, we have introduced a precision on the bitmap measure, such as bitmap(100), bitmap(10000), or bitmap(1000000), meaning the measure can accept a cardinality of at most 100, 10,000, or 1M. This should be fine in practice: if the count exceeds 1,000,000, the hyperloglog measure, which produces an approximate result, should be acceptable.
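To illustrate the idea of a precision-bounded bitmap measure, here is a minimal sketch of a counter that parses a type string such as bitmap(100) and refuses to grow past that cardinality. This is not Kylin's BitmapMeasure code: the class name BoundedBitmapCounter, its methods, and the use of java.util.BitSet are all assumptions made for illustration. The issue does not say what happens when the limit is reached, so throwing an exception here is also an assumption.

```java
import java.util.BitSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of a cardinality-bounded precise distinct counter. */
final class BoundedBitmapCounter {
    // Matches type strings of the form "bitmap(N)", as quoted in the issue.
    private static final Pattern TYPE = Pattern.compile("bitmap\\((\\d+)\\)");

    private final int maxCardinality;      // the N in bitmap(N)
    private final BitSet bits = new BitSet();
    private int count = 0;                 // number of distinct values seen

    BoundedBitmapCounter(int maxCardinality) {
        if (maxCardinality <= 0) {
            throw new IllegalArgumentException("precision must be positive");
        }
        this.maxCardinality = maxCardinality;
    }

    /** Builds a counter from a type string such as "bitmap(10000)". */
    static BoundedBitmapCounter fromType(String dataType) {
        Matcher m = TYPE.matcher(dataType.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("expected bitmap(N), got: " + dataType);
        }
        return new BoundedBitmapCounter(Integer.parseInt(m.group(1)));
    }

    /** Adds a non-negative int value; duplicates do not change the count. */
    void add(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("this sketch only handles non-negative ints");
        }
        if (bits.get(value)) {
            return; // already counted
        }
        if (count >= maxCardinality) {
            // Assumed behaviour: reject values beyond the declared precision.
            throw new IllegalStateException("cardinality exceeds bitmap(" + maxCardinality + ")");
        }
        bits.set(value);
        count++;
    }

    int count() {
        return count;
    }

    int maxCardinality() {
        return maxCardinality;
    }
}
```

Because the upper bound is known from the type string, a caller can size its buffers from the declared precision instead of a fixed worst case, which is the motivation given in the description above.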
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)