You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kylin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/12/01 01:44:00 UTC

[jira] [Commented] (KYLIN-4823) Push down having filter error when group by dynamic column

    [ https://issues.apache.org/jira/browse/KYLIN-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241188#comment-17241188 ] 

ASF GitHub Bot commented on KYLIN-4823:
---------------------------------------

hit-lacus edited a comment on pull request #1493:
URL: https://github.com/apache/kylin/pull/1493#issuecomment-735928205


   ## Reproduce SQL
   
   ```sql
   SELECT SELLER_ID,
       CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID + 0.3 else LSTG_SITE_ID END AS dyna_group,
       SUM(PRICE)
   FROM KYLIN_SALES
   GROUP BY SELLER_ID,
       CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID + 0.3 else LSTG_SITE_ID END
   HAVING SUM(PRICE)>10
   ```
   
   
   ### Cause Analysis
   
   1. Having Filter is wrong, should be `price` not `LSTG_SITE_ID` ! Let's see what happen ?
   <img width="1127" alt="image" src="https://user-images.githubusercontent.com/14030549/100642316-993b7b80-3373-11eb-88cf-b1a7db7044e8.png">
   
   2. Fetch duplicate column in TupleExpression, two `LSTG_SITE_ID`.
   <img width="842" alt="image" src="https://user-images.githubusercontent.com/14030549/100642470-c556fc80-3373-11eb-9601-0e2ac26f3c90.png">
   
   3. Add same column(`LSTG_SITE_ID`) to groupBy column twice.
   <img width="857" alt="image" src="https://user-images.githubusercontent.com/14030549/100642615-f1727d80-3373-11eb-977a-a530e2ab36f6.png">
   
   4. Start to create a having TupleFilter
   <img width="1042" alt="image" src="https://user-images.githubusercontent.com/14030549/100642787-31d1fb80-3374-11eb-9acc-babce32c535b.png">
   
   5. `$2` is wrong because 
   <img width="1210" alt="image" src="https://user-images.githubusercontent.com/14030549/100642872-5201ba80-3374-11eb-872f-b44c46e85b80.png">
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Push down having filter error  when group by dynamic column
> -----------------------------------------------------------
>
>                 Key: KYLIN-4823
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4823
>             Project: Kylin
>          Issue Type: Bug
>          Components: Query Engine
>    Affects Versions: v3.1.0
>            Reporter: hcy
>            Priority: Major
>
> 如果cube只有一个segment,且shard by的列存在于group by中,满足having filter push down的条件时,如果group by中存在动态列,并且case when then 中的表达是为column而不是常量时会报数组越界的错误。配置kylin.query.enable-dynamic-column=true无效,也会报错。
> 测试Cube如下:
> 模型为kylin example中的kylin_sales_model,cube为kylin_sales_cube,为了重现错误把BUYER_ID的rowkey设置为shard by
> 测试SQL如下:
> SELECT BUYER_ID,
>  CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END AS dyna_group,
>  SUM(PRICE)
>  FROM KYLIN_SALES
>  GROUP BY
>  BUYER_ID,
>  CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END
>  HAVING SUM(PRICE)>10
> 报错如下:
> {color:#b94a48}Index: 4, Size: 1 while executing SQL: "select * from (SELECT BUYER_ID, CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END AS dyna_group, SUM(PRICE) FROM KYLIN_SALES GROUP BY BUYER_ID, CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END HAVING SUM(PRICE)>10) limit 50000"{color}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 4, Size: 1
>  at java.util.ArrayList.rangeCheck(ArrayList.java:657)
>  at java.util.ArrayList.get(ArrayList.java:433)
>  at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:552)
>  at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:189)
>  at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:89)
>  at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
>  at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)