Posted to issues@kylin.apache.org by "hongbin ma (JIRA)" <ji...@apache.org> on 2016/08/29 10:10:21 UTC

[jira] [Commented] (KYLIN-1954) BuildInFunctionTransformer should be executed per CubeSegmentScanner

    [ https://issues.apache.org/jira/browse/KYLIN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15445454#comment-15445454 ] 

hongbin ma commented on KYLIN-1954:
-----------------------------------

In CubeStorageQuery.search / CubeSegmentScanner:

When the filter is translated for the first segment, it is changed to a
CompareTupleFilter (IN clause), so translation is not triggered for the
next segments. This is not right, because the dictionary is not the same
for every segment.

Assume data like this:

merchant_name  cube segment
深海新创专营          20160725
深海新创手机          20160726

When searching with LIKE '%深海新创%', CubeSegmentScanner scans segment
'20160725' and the filter is changed to an IN clause (IN '深海新创专营').
The result is right for this segment, but not for the next segments,
because the filter has already been changed.
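
The effect can be reproduced outside Kylin with a small sketch. Everything
below is hypothetical and only models the behaviour described above: the
class SegmentFilterSketch, its Filter type, and the two scan methods are
illustration names, not Kylin APIs, and each segment dictionary is reduced
to a set of strings. It contrasts one filter object shared across segments
with a fresh translation per segment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SegmentFilterSketch {

    /** Hypothetical stand-in for a filter that is rewritten in place from LIKE to IN. */
    public static final class Filter {
        private final String likeSubstring; // body of the LIKE '%...%' pattern
        private Set<String> inValues;       // null until translated to an IN clause

        public Filter(String likeSubstring) {
            this.likeSubstring = likeSubstring;
        }

        /** Translates LIKE to IN against one dictionary; a no-op once translated. */
        public void translate(Set<String> dictionary) {
            if (inValues != null) {
                return; // already an IN clause, so later dictionaries are ignored
            }
            Set<String> matches = new TreeSet<>();
            for (String value : dictionary) {
                if (value.contains(likeSubstring)) {
                    matches.add(value);
                }
            }
            inValues = matches;
        }

        public boolean accepts(String value) {
            return inValues != null && inValues.contains(value);
        }
    }

    /** Sample data from the comment: one dictionary per segment (assumed shape). */
    public static Map<String, Set<String>> segments() {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        result.put("20160725", Collections.singleton("深海新创专营"));
        result.put("20160726", Collections.singleton("深海新创手机"));
        return result;
    }

    /** Models the reported behaviour: one filter instance reused for all segments. */
    public static List<String> scanSharedFilter(String pattern) {
        Filter shared = new Filter(pattern);
        List<String> rows = new ArrayList<>();
        for (Set<String> dictionary : segments().values()) {
            shared.translate(dictionary); // only takes effect for the first segment
            for (String value : dictionary) {
                if (shared.accepts(value)) {
                    rows.add(value);
                }
            }
        }
        return rows;
    }

    /** Models the proposed behaviour: translate the filter once per segment. */
    public static List<String> scanPerSegmentFilter(String pattern) {
        List<String> rows = new ArrayList<>();
        for (Set<String> dictionary : segments().values()) {
            Filter perSegment = new Filter(pattern);
            perSegment.translate(dictionary);
            for (String value : dictionary) {
                if (perSegment.accepts(value)) {
                    rows.add(value);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(scanSharedFilter("深海新创"));     // [深海新创专营]
        System.out.println(scanPerSegmentFilter("深海新创")); // [深海新创专营, 深海新创手机]
    }
}
```

In this sketch the shared filter drops the row from segment '20160726',
while the per-segment translation returns both rows.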

> BuildInFunctionTransformer should be executed per CubeSegmentScanner
> --------------------------------------------------------------------
>
>                 Key: KYLIN-1954
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1954
>             Project: Kylin
>          Issue Type: Improvement
>    Affects Versions: v1.5.3
>            Reporter: hongbin ma
>            Assignee: hongbin ma
>
> Reported from the dev mailing list thread "Question abount BuildInFunctionTransformer":
> Sorry for the wrong description, and thanks for the explanation.
> I have another question on this.
> Case1
> select merchant_name,dt_day,count(*)
> from session_view_shop_0
> where merchant_name like '%深海新创手机%'
> and dt_year='2016'
> and dt_month='07'
> and dt_day >='25'
> and dt_day <='28'
> group by merchant_name,dt_day
> 2016-08-05 09:25:06,263 INFO  [http-bio-7070-exec-10] dict.BuildInFunctionTransformer:66 : Translated {LIKE(KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME,%深海新创手机%)} to IN clause: {KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME IN []}
> Result1
> 深海新创手机专营店80002972 28 6360
> 深海新创手机专营店80002972 27 5501
> 深海新创手机专营店80002972 26 4830
> Case 2
> select merchant_name,dt_day,count(*)
> from session_view_shop_0
> where merchant_name like '%深海新创%'
> and dt_year='2016'
> and dt_month='07'
> and dt_day >='25'
> and dt_day <='28'
> group by merchant_name,dt_day
> 2016-08-05 09:37:55,469 INFO  [http-bio-7070-exec-15] dict.BuildInFunctionTransformer:66 : Translated {LIKE(KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME,%深海新创%)} to IN clause: {KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME IN [深海新创专营店80002972]}
> Result2
> 深海新创专营店80002972 25 5283
> '深海新创手机专营店80002972' is expected in Result2, since Case1 shows that it exists.
> In CubeStorageQuery.search / CubeSegmentScanner:
> When the filter is translated for the first segment, it is changed to a
> CompareTupleFilter (IN clause), so translation is not triggered for the
> next segments. This is not right, because the dictionary is not the same
> for every segment.
> Assume data like this:
> merchant_name  cube segment
> 深海新创专营          20160725
> 深海新创手机          20160726
> When searching with LIKE '%深海新创%', CubeSegmentScanner scans segment
> '20160725' and the filter is changed to an IN clause (IN '深海新创专营').
> The result is right for this segment, but not for the next segments,
> because the filter has already been changed.


