You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kylin.apache.org by "shengxiaopeng (JIRA)" <ji...@apache.org> on 2018/11/26 08:10:00 UTC
[jira] [Updated] (KYLIN-3703) get negative value when query kylin
[ https://issues.apache.org/jira/browse/KYLIN-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
shengxiaopeng updated KYLIN-3703:
---------------------------------
Description:
version: Kylin 2.5.1
There is a cube built in real time, one metric is average time consuming, *the source data does not have a negative value, but the negative value occurs by hourly aggregation*, and the aggregation is normal by minute
*1. Metric definition*
|Name|Expression|Parameters|Return Type|
|SUM_FS|SUM|Value:APM_FIRSTSCREEN_WEB.PRF_PL_FS, Type:column|bigint|
*2.Aggregate sql and results by hour,then negative measure value occurs*
```
select
hour_start
,count(*) as amount
,sum(prf_pl_fs) as sum_pl_fs
from DY_APM.APM_FIRSTSCREEN_WEB
where 1=1 and minute_start >= '2018-11-23 20:00:00' and minute_start < '2018-11-23 21:00:00'
group by hour_start
order by hour_start asc
limit 15000
```
|hour_start|amount|sum_pl_fs|
|2018-11-23 20:00:00|974466|-1317398890|
*3.Aggregate sql and results by minute*
```
select
minute_start
,count(*) as amount
,sum(prf_pl_fs) as sum_pl_fs
from DY_APM.APM_FIRSTSCREEN_WEB
where 1=1 and minute_start >= '2018-11-23 20:00:00' and minute_start < '2018-11-23 21:00:00'
group by minute_start
order by minute_start asc
limit 15000
```
|hour_start|amount|sum_pl_fs|
|2018-11-23 20:21:00|2629|8107124|
|2018-11-23 20:22:00|12216|36558869|
|2018-11-23 20:23:00|12800|38401450|
|...|略|...|
4. I guess this should be caused by crossing the border.
I use the data of each minute to get the hourly data, and the total time is *2977568406*, which is far from the scope of bigint.
```
select
sum(t.amount) amount
,sum(t.sum_pl_fs) sum_pl_fs
,sum(t.sum_pl_fs) / sum(t.amount) as avg_pl_fs
from
(
select
minute_start
,count(*) as amount
,sum(prf_pl_fs) as sum_pl_fs
from DY_APM.APM_FIRSTSCREEN_WEB
where 1=1 and minute_start >= '2018-11-23 20:00:00' and minute_start < '2018-11-23 21:00:00'
group by minute_start
order by minute_start asc
limit 15000
) as t
```
|amount|sum_pl_fs|avg_pl_fs|
|974466|2977568406|3055|
*Excuse me, is this question a problem with my configuration or query? How should it be solved?*
was:
version: Kylin 2.5.1
There is a cube built in real time, one metric is average time consuming, the source data does not have a negative value, but the negative value occurs by hourly aggregation, and the aggregation is normal by minute
Metric definition
|Name|Expression|Parameters|Return Type|
|SUM_FS|SUM|Value:APM_FIRSTSCREEN_WEB.PRF_PL_FS, Type:column|bigint|
#
## Aggregate sql and results by hour
```
select
hour_start
,count(*) as amount
,sum(prf_pl_fs) as sum_pl_fs
from DY_APM.APM_FIRSTSCREEN_WEB
where 1=1 and minute_start >= '2018-11-23 20:00:00' and minute_start < '2018-11-23 21:00:00'
group by hour_start
order by hour_start asc
limit 15000
```
|hour_start|amount|sum_pl_fs|
|2018-11-23 20:00:00|974466|-1317398890|
#
## Aggregate sql and results by minute
```
select
minute_start
,count(*) as amount
,sum(prf_pl_fs) as sum_pl_fs
from DY_APM.APM_FIRSTSCREEN_WEB
where 1=1 and minute_start >= '2018-11-23 20:00:00' and minute_start < '2018-11-23 21:00:00'
group by minute_start
order by minute_start asc
limit 15000
```
|hour_start|amount|sum_pl_fs|
|2018-11-23 20:21:00|2629|8107124|
|2018-11-23 20:22:00|12216|36558869|
|2018-11-23 20:23:00|12800|38401450|
|...|略|...|
I guess this should be caused by crossing the border. # I use the data of each minute to get the hourly data, and the total time is 2977568406, which is far from the scope of bigint.
```
select
sum(t.amount) amount
,sum(t.sum_pl_fs) sum_pl_fs
,sum(t.sum_pl_fs) / sum(t.amount) as avg_pl_fs
from
(
select
minute_start
,count(*) as amount
,sum(prf_pl_fs) as sum_pl_fs
from DY_APM.APM_FIRSTSCREEN_WEB
where 1=1 and minute_start >= '2018-11-23 20:00:00' and minute_start < '2018-11-23 21:00:00'
group by minute_start
order by minute_start asc
limit 15000
) as t
```
|amount|sum_pl_fs|avg_pl_fs|
|974466|2977568406|3055|
Excuse me, is this question a problem with my configuration or query? How should it be solved?
> get negative value when query kylin
> ------------------------------------
>
> Key: KYLIN-3703
> URL: https://issues.apache.org/jira/browse/KYLIN-3703
> Project: Kylin
> Issue Type: Bug
> Reporter: shengxiaopeng
> Priority: Minor
> Fix For: v2.5.1
>
>
> version: Kylin 2.5.1
> There is a cube built in real time, one metric is average time consuming, *the source data does not have a negative value, but the negative value occurs by hourly aggregation*, and the aggregation is normal by minute
>
> *1. Metric definition*
> |Name|Expression|Parameters|Return Type|
> |SUM_FS|SUM|Value:APM_FIRSTSCREEN_WEB.PRF_PL_FS, Type:column|bigint|
>
> *2.Aggregate sql and results by hour,then negative measure value occurs*
> ```
> select
> hour_start
> ,count(*) as amount
> ,sum(prf_pl_fs) as sum_pl_fs
> from DY_APM.APM_FIRSTSCREEN_WEB
> where 1=1 and minute_start >= '2018-11-23 20:00:00' and minute_start < '2018-11-23 21:00:00'
> group by hour_start
> order by hour_start asc
> limit 15000
> ```
>
> |hour_start|amount|sum_pl_fs|
> |2018-11-23 20:00:00|974466|-1317398890|
> *3.Aggregate sql and results by minute*
> ```
> select
> minute_start
> ,count(*) as amount
> ,sum(prf_pl_fs) as sum_pl_fs
> from DY_APM.APM_FIRSTSCREEN_WEB
> where 1=1 and minute_start >= '2018-11-23 20:00:00' and minute_start < '2018-11-23 21:00:00'
> group by minute_start
> order by minute_start asc
> limit 15000
> ```
> |hour_start|amount|sum_pl_fs|
> |2018-11-23 20:21:00|2629|8107124|
> |2018-11-23 20:22:00|12216|36558869|
> |2018-11-23 20:23:00|12800|38401450|
> |...|略|...|
>
> 4. I guess this should be caused by crossing the border.
> I use the data of each minute to get the hourly data, and the total time is *2977568406*, which is far from the scope of bigint.
> ```
> select
> sum(t.amount) amount
> ,sum(t.sum_pl_fs) sum_pl_fs
> ,sum(t.sum_pl_fs) / sum(t.amount) as avg_pl_fs
> from
> (
> select
> minute_start
> ,count(*) as amount
> ,sum(prf_pl_fs) as sum_pl_fs
> from DY_APM.APM_FIRSTSCREEN_WEB
> where 1=1 and minute_start >= '2018-11-23 20:00:00' and minute_start < '2018-11-23 21:00:00'
> group by minute_start
> order by minute_start asc
> limit 15000
> ) as t
> ```
> |amount|sum_pl_fs|avg_pl_fs|
> |974466|2977568406|3055|
>
> *Excuse me, is this question a problem with my configuration or query? How should it be solved?*
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)