You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kylin.apache.org by "Shaofeng SHI (JIRA)" <ji...@apache.org> on 2017/10/31 03:24:00 UTC

[jira] [Updated] (KYLIN-2773) Should not push down join condition related columns are compatible while not consistent

     [ https://issues.apache.org/jira/browse/KYLIN-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaofeng SHI updated KYLIN-2773:
--------------------------------
    Fix Version/s:     (was: v2.2.0)
                   v2.3.0

> Should not push down join condition related columns are compatible while not consistent
> ---------------------------------------------------------------------------------------
>
>                 Key: KYLIN-2773
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2773
>             Project: Kylin
>          Issue Type: Bug
>          Components: Query Engine
>            Reporter: Zhong Yanghong
>            Assignee: Zhong Yanghong
>             Fix For: v2.3.0
>
>         Attachments: APACHE-KYLIN-2773.patch
>
>
> For sql,
> {code}
> select PART_DT, META_CATEG_NAME, sum(price) 
> from KYLIN_SALES 
> INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID
> AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID
> INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT
> group by PART_DT, META_CATEG_NAME
> order by PART_DT, META_CATEG_NAME
> {code}
> the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer.
> Then the plan transformed by kylin is as follows:
> {code}
> OLAPToEnumerableConverter 
>   OLAPLimitRel(fetch=[50000]) 
>     OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) 
>       OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)]) 
>         OLAPProjectRel(PART_DT=[$0], META_CATEG_NAME=[$16], PRICE=[$5]) 
>           OLAPJoinRel(condition=[=($0, $19)], joinType=[inner]) 
> {code}
> {color:#f79232}
> {code}
>             OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], UPD_USER=[$t13], LEAF_CATEG_ID0=[$t14], SITE_ID=[$t15], META_CATEG_NAME=[$t16], CATEG_LVL2_NAME=[$t17], CATEG_LVL3_NAME=[$t18])
>               OLAPJoinRel(condition=[AND(=($3, $19), =($4, $15))], joinType=[inner])
>                 OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
>                 OLAPProjectRel(expr#0..8=[{inputs}], expr#9=[CAST($t4):BIGINT], USER_DEFINED_FIELD1=[$t0], USER_DEFINED_FIELD3=[$t1], UPD_DATE=[$t2], UPD_USER=[$t3], LEAF_CATEG_ID=[$t4], SITE_ID=[$t5], META_CATEG_NAME=[$t6], CATEG_LVL2_NAME=[$t7], CATEG_LVL3_NAME=[$t8], LEAF_CATEG_ID9=[$t9])
> {code}
> {color}
> {code}
>                   OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]])
>             OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, 3]])
> {code} 
> However, what we expect is as follows:
> {code}
> OLAPToEnumerableConverter
>   OLAPLimitRel(fetch=[50000])
>     OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
>       OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)])
>         OLAPProjectRel(expr#0..21=[{inputs}], PART_DT=[$t0], META_CATEG_NAME=[$t17], PRICE=[$t4])
>           OLAPJoinRel(condition=[=($0, $19)], joinType=[inner]) 
> {code}
> {color:#f79232}
> {code}
>             OLAPJoinRel(condition=[AND(=($2, $10), =($3, $11))], joinType=[inner]) 
>               OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) 
> {code}
> {color}
> {code}
>               OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) 
>             OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, 3, 4]])
> {code}
> The reason for this difference is as follows:
> * Although we remove the {{JoinPushExpressionsRule}} in {{OLAPTableScan}}, the method {{RelOptUtil.pushDownJoinConditions()}} is still invoked when creating a join in {{SqlToRelConverter}}.
> * In the method of {{RelOptUtil.pushDownJoinConditions()}}, since the datatypes of the join related columns are not the same, *cast* function will be automatically assigned to KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID. Then a {{OLAPProjectRel}} will be introduced.
> In kylin, we don't need {{RelOptUtil.pushDownJoinConditions()}}. Therefore, the solution for this is just remove that logic.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)