Posted to issues@kylin.apache.org by "zhao jintao (JIRA)" <ji...@apache.org> on 2019/04/17 14:26:00 UTC
[jira] [Updated] (KYLIN-3961) Optimize TopN measure merge function to reduce topNCounter errors
[ https://issues.apache.org/jira/browse/KYLIN-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhao jintao updated KYLIN-3961:
-------------------------------
Summary: Optimize TopN measure merge function to reduce topNCounter errors (was: Optimize TopN measure merge function to reduce mistaks )
> Optimize TopN measure merge function to reduce topNCounter errors
> ---------------------------------------------------------------------
>
> Key: KYLIN-3961
> URL: https://issues.apache.org/jira/browse/KYLIN-3961
> Project: Kylin
> Issue Type: Improvement
> Components: Measure - TopN
> Affects Versions: v2.5.2
> Environment: Huawei FusionInsight
> Reporter: zhao jintao
> Assignee: zhao jintao
> Priority: Major
> Labels: easyfix
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Hi Team:
> I use a "Top-N" measure to serve queries such as "select sum(AAA) from BBB group by CCC,DDD". It performs much better than a cube without "Top-N".
> In my system, Kylin takes just 0.2s to answer the query from a cube with a "Top-N" measure; without the "Top-N" measure it may take 10s.
> But I find that the Top-N measure can be optimized to reduce errors.
> I used the Kylin demo to test "TopN".
> I built two cubes from "KYLIN_SALES". The first cube has three dimensions, "SELLER_ID", "BUYER_ID" and "PART_DT", and one measure, "SUM(PRICE)". The second cube has one dimension, "PART_DT", and two measures, "SUM(PRICE)" and "TOPN(10)". For "TOPN(10)", the "ORDER|SUM by Column" is "PRICE", the "Group by Column" is "SELLER_ID" and "BUYER_ID", and the "Return Type" is "Top 10". I then built the cubes for the range "2012-01-01" to "2014-01-01".
> I ran the same SQL against both cubes and found a large discrepancy between the results.
> The top 5 "SUM PRICE" values from the first cube (without "TopN") are "167.7269", "99.9908", "99.9888", "99.9865", "99.978".
> The top 5 "SUM PRICE" values from the second cube (with "TopN") are "179.27699...", "167.6320...", "167.3050...", "167.2069...", "166.7429...".
> Has anyone else met the same problem?
>
> Best regards.
>
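One way a merge step could inflate top-N sums, as in the report above where the "TopN" values exceed the exact ones, is sketched below. This is only an illustration under stated assumptions, not Kylin's actual implementation: it assumes each segment keeps a bounded summary of its largest entries and that, on merge, a key missing from a full summary is credited that summary's minimum value (a SpaceSaving-style upper bound). The names truncate and merge_summaries and the sample data are hypothetical.

```python
def truncate(counts, capacity):
    """Keep only the `capacity` largest entries, as a bounded top-N summary would."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:capacity])


def merge_summaries(a, b, capacity):
    """Merge two bounded summaries (hypothetical SpaceSaving-style merge).

    A key absent from a full summary had some value between 0 and that
    summary's minimum, so the minimum is used as an upper-bound estimate.
    """
    min_a = min(a.values()) if len(a) >= capacity else 0.0
    min_b = min(b.values()) if len(b) >= capacity else 0.0
    merged = {key: a.get(key, min_a) + b.get(key, min_b) for key in set(a) | set(b)}
    return truncate(merged, capacity)


# Hypothetical per-segment sums of PRICE by seller (made-up data).
segment_1 = {"s1": 100.0, "s2": 90.0, "s3": 10.0}
segment_2 = {"s3": 95.0, "s4": 80.0, "s1": 5.0}
capacity = 2

# Exact totals, as a cube with a plain SUM measure would report them.
exact = {}
for segment in (segment_1, segment_2):
    for key, value in segment.items():
        exact[key] = exact.get(key, 0.0) + value

# Totals after truncating each segment to `capacity` entries and merging.
merged = merge_summaries(
    truncate(segment_1, capacity), truncate(segment_2, capacity), capacity
)

for key, estimate in sorted(merged.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{key}: estimate={estimate} exact={exact[key]}")
```

In this made-up data every merged estimate is at least as large as the exact sum, because the minimum stands in for values that were dropped before the merge. Whether the same effect explains the numbers in the report would need to be checked against the actual merge code.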
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)