Posted to issues@kylin.apache.org by "hcy (Jira)" <ji...@apache.org> on 2020/11/18 06:26:00 UTC

[jira] [Updated] (KYLIN-4823) Push down having filter error when group by dynamic column

     [ https://issues.apache.org/jira/browse/KYLIN-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hcy updated KYLIN-4823:
-----------------------
    Description: 
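If the cube has only one segment and the shard-by column appears in the GROUP BY, so that the conditions for having filter push down are met, the query fails with an array index out-of-bounds error when the GROUP BY contains a dynamic column and the expressions in the CASE WHEN ... THEN branches are columns rather than constants. Setting kylin.query.enable-dynamic-column=true has no effect; the error still occurs.

The test cube is as follows:

The model is kylin_sales_model from the Kylin example, and the cube is kylin_sales_cube. To reproduce the error, the rowkey of BUYER_ID is set to shard by.

The test SQL is as follows: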

SELECT BUYER_ID,
 CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END AS dyna_group,
 SUM(PRICE)
 FROM KYLIN_SALES
 GROUP BY
 BUYER_ID,
 CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END
 HAVING SUM(PRICE)>10

The error is as follows:

{color:#b94a48}Index: 4, Size: 1 while executing SQL: "select * from (SELECT BUYER_ID, CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END AS dyna_group, SUM(PRICE) FROM KYLIN_SALES GROUP BY BUYER_ID, CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END HAVING SUM(PRICE)>10) limit 50000"{color}

Caused by: java.lang.IndexOutOfBoundsException: Index: 4, Size: 1
 at java.util.ArrayList.rangeCheck(ArrayList.java:657)
 at java.util.ArrayList.get(ArrayList.java:433)
 at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:552)
 at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:189)
 at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:89)
 at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
 at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)

  was:
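If the cube has only one segment and the shard-by column appears in the GROUP BY, so that the conditions for having filter push down are met, the query fails with a data out-of-bounds error when the GROUP BY contains a dynamic column and the expressions in the CASE WHEN ... THEN branches are columns rather than constants. Setting kylin.query.enable-dynamic-column=true has no effect; the error still occurs.

The test cube is as follows:

The model is kylin_sales_model from the Kylin example, and the cube is kylin_sales_cube. To reproduce the error, the rowkey of BUYER_ID is set to shard by.

The test SQL is as follows: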

SELECT BUYER_ID,
CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END AS dyna_group,
SUM(PRICE)
FROM KYLIN_SALES
GROUP BY
BUYER_ID,
CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END
HAVING SUM(PRICE)>10

The error is as follows:

{color:#b94a48}Index: 4, Size: 1 while executing SQL: "select * from (SELECT BUYER_ID, CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END AS dyna_group, SUM(PRICE) FROM KYLIN_SALES GROUP BY BUYER_ID, CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END HAVING SUM(PRICE)>10) limit 50000"{color}

Caused by: java.lang.IndexOutOfBoundsException: Index: 4, Size: 1
 at java.util.ArrayList.rangeCheck(ArrayList.java:657)
 at java.util.ArrayList.get(ArrayList.java:433)
 at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:552)
 at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:189)
 at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:89)
 at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
 at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)


> Push down having filter error  when group by dynamic column
> -----------------------------------------------------------
>
>                 Key: KYLIN-4823
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4823
>             Project: Kylin
>          Issue Type: Bug
>          Components: Query Engine
>    Affects Versions: v3.1.0
>            Reporter: hcy
>            Priority: Major
>
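> If the cube has only one segment and the shard-by column appears in the GROUP BY, so that the conditions for having filter push down are met, the query fails with an array index out-of-bounds error when the GROUP BY contains a dynamic column and the expressions in the CASE WHEN ... THEN branches are columns rather than constants. Setting kylin.query.enable-dynamic-column=true has no effect; the error still occurs.
> The test cube is as follows:
> The model is kylin_sales_model from the Kylin example, and the cube is kylin_sales_cube. To reproduce the error, the rowkey of BUYER_ID is set to shard by.
> The test SQL is as follows: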
> SELECT BUYER_ID,
>  CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END AS dyna_group,
>  SUM(PRICE)
>  FROM KYLIN_SALES
>  GROUP BY
>  BUYER_ID,
>  CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END
>  HAVING SUM(PRICE)>10
> The error is as follows:
> {color:#b94a48}Index: 4, Size: 1 while executing SQL: "select * from (SELECT BUYER_ID, CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END AS dyna_group, SUM(PRICE) FROM KYLIN_SALES GROUP BY BUYER_ID, CASE WHEN LSTG_SITE_ID > 1 then LSTG_SITE_ID else LEAF_CATEG_ID END HAVING SUM(PRICE)>10) limit 50000"{color}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 4, Size: 1
>  at java.util.ArrayList.rangeCheck(ArrayList.java:657)
>  at java.util.ArrayList.get(ArrayList.java:433)
>  at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:552)
>  at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:189)
>  at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:89)
>  at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
>  at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)
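
The stack trace shows ArrayList.get being called from checkHavingCanPushDown with index 4 on a list of size 1. The following is a minimal sketch of that failure mode only, not Kylin's code: the class HavingIndexSketch and the helpers columnAt and columnAtOrNull are hypothetical names introduced for illustration, and the assumption is simply that a position computed against a wider column layout is used to index a shorter list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HavingIndexSketch {
    // Hypothetical helper (not from Kylin): looks up a column by a position
    // that was computed against a wider layout than the list actually holds.
    static String columnAt(List<String> columns, int index) {
        return columns.get(index); // throws IndexOutOfBoundsException if index >= size
    }

    // Hypothetical guarded variant: returns null instead of throwing
    // when the position falls outside the list.
    static String columnAtOrNull(List<String> columns, int index) {
        return (index >= 0 && index < columns.size()) ? columns.get(index) : null;
    }

    public static void main(String[] args) {
        // A one-element list, mirroring "Size: 1" in the reported error.
        List<String> columns = new ArrayList<>(Arrays.asList("BUYER_ID"));
        try {
            columnAt(columns, 4); // mirrors "Index: 4" in the reported error
        } catch (IndexOutOfBoundsException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
        System.out.println("guarded lookup: " + columnAtOrNull(columns, 4));
    }
}
```

Whether a bounds check like the guarded variant, or a different index mapping for dynamic columns, is the appropriate fix in GTCubeStorageQueryBase is not established by this report.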



--
This message was sent by Atlassian Jira
(v8.3.4#803005)