Posted to issues@kylin.apache.org by "Lianggang (JIRA)" <ji...@apache.org> on 2018/12/20 10:12:00 UTC

[jira] [Created] (KYLIN-3733) kylin can't response correct data when using "in" filter

Lianggang created KYLIN-3733:
--------------------------------

             Summary: kylin can't response correct data when using "in" filter
                 Key: KYLIN-3733
                 URL: https://issues.apache.org/jira/browse/KYLIN-3733
             Project: Kylin
          Issue Type: Bug
    Affects Versions: v2.5.0, v2.3.2
            Reporter: Lianggang
         Attachments: demo.txt

Hi guys,

I used the attached data to build a cube in Kylin and ran into a strange issue.

When I query with sql_1, the result is correct.

sql_1:

select exp_id,os,new_user_flag
,sum(sample_size)
,sum(sum_x)
,sum(sum_square_x)
,sum(act_dev)
from RPT_QUKAN.ABTEST_GRAPH_DETAIL_CUBE_MIDU_VIEW5
where refresh_dt ='2018-12-13'
and exp_id in ('8382')
group by exp_id,os,new_user_flag;

 

When I query with sql_2, which differs only in that the "in" list has two values, the result is *not* correct.

sql_2:

select exp_id,os,new_user_flag
,sum(sample_size)
,sum(sum_x)
,sum(sum_square_x)
,sum(act_dev)
from RPT_QUKAN.ABTEST_GRAPH_DETAIL_CUBE_MIDU_VIEW5
where refresh_dt ='2018-12-13'
and exp_id in ('8382','8383')
group by exp_id,os,new_user_flag;

 

When I wrap exp_id in trim(), the result is correct again, as in sql_3:

select exp_id,os,new_user_flag
,sum(sample_size)
,sum(sum_x)
,sum(sum_square_x)
,sum(act_dev)
from RPT_QUKAN.ABTEST_GRAPH_DETAIL_CUBE_MIDU_VIEW5
where refresh_dt ='2018-12-13'
and trim(exp_id) in ('8382','8383')
group by exp_id,os,new_user_flag;

I am not sure whether an overly long rowkey is what causes this result. Please help check the issue. Thanks!

The table structure is as follows:

CREATE TABLE `RPT_QUKAN.ABTEST_GRAPH_DETAIL_CUBE_MIDU_VIEW5` (
`exp_id` string
,`new_user_flag` string
,`os` string
,`sample_size` double
,`sum_x` double
,`sum_square_x` double
,`act_dev` bigint
,`data_dt` string
,`refresh_dt` string)

 

The dimension columns: exp_id,new_user_flag,os,data_dt,refresh_dt

The *sum* measure columns: sum_x,sample_size,sum_square_x,act_dev
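
For comparison, here is a minimal sketch of the behavior I would expect. It runs the same three predicates against an in-memory SQLite table rather than Kylin. The table name abtest_view and the sample rows are hypothetical stand-ins (the real data is in demo.txt), and the Hive types are mapped to their nearest SQLite equivalents. When exp_id holds no surrounding whitespace, the plain "in" filter and the trim() variant should return identical rows.

```python
import sqlite3

# Hypothetical sample rows; the real data is in the attached demo.txt.
# Column order follows the CREATE TABLE statement in the report.
ROWS = [
    ("8382", "1", "android", 10.0, 5.0, 2.5, 3, "2018-12-12", "2018-12-13"),
    ("8383", "0", "ios", 20.0, 8.0, 4.0, 6, "2018-12-12", "2018-12-13"),
    ("8384", "1", "android", 30.0, 9.0, 4.5, 9, "2018-12-12", "2018-12-13"),
]

# Same shape as sql_1..sql_3; only the exp_id predicate varies.
# "abtest_view" is a stand-in name for the Hive view in the report.
QUERY = """
select exp_id, os, new_user_flag,
       sum(sample_size), sum(sum_x), sum(sum_square_x), sum(act_dev)
from abtest_view
where refresh_dt = '2018-12-13'
  and {predicate}
group by exp_id, os, new_user_flag
order by exp_id, os, new_user_flag
"""


def run(predicate):
    """Load the sample rows into a fresh in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table abtest_view ("
        "exp_id text, new_user_flag text, os text, "
        "sample_size real, sum_x real, sum_square_x real, "
        "act_dev integer, data_dt text, refresh_dt text)"
    )
    conn.executemany(
        "insert into abtest_view values (?, ?, ?, ?, ?, ?, ?, ?, ?)", ROWS
    )
    rows = conn.execute(QUERY.format(predicate=predicate)).fetchall()
    conn.close()
    return rows


single_in = run("exp_id in ('8382')")                      # like sql_1
plain_in = run("exp_id in ('8382', '8383')")               # like sql_2
trimmed_in = run("trim(exp_id) in ('8382', '8383')")       # like sql_3

# Expected semantics: sql_2 and sql_3 agree, and sql_2 contains sql_1's row.
print(plain_in == trimmed_in)
print(single_in[0] in plain_in)
```

Running the same comparison against Kylin (sql_2 versus sql_3) is what shows the mismatch described above. This sketch only illustrates the expected result; it does not reproduce the bug.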



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)