Posted to issues@kylin.apache.org by "Richard Zhang (Jira)" <ji...@apache.org> on 2021/11/25 07:06:00 UTC
[jira] [Commented] (KYLIN-1467) subquery optimization: remove unused subquery output columns
[ https://issues.apache.org/jira/browse/KYLIN-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448971#comment-17448971 ]
Richard Zhang commented on KYLIN-1467:
--------------------------------------
Has this problem been solved? If so, could you please tell me how it was solved? Thank you.
> subquery optimization: remove unused subquery output columns
> -------------------------------------------------------------
>
> Key: KYLIN-1467
> URL: https://issues.apache.org/jira/browse/KYLIN-1467
> Project: Kylin
> Issue Type: Improvement
> Components: Query Engine
> Affects Versions: v1.2
> Reporter: chelubai
> Priority: Minor
> Labels: newbie
>
> Table 'tb' has more than 10,000,000 rows.
> {code:sql}
> -- sql-1:
> select a from tb group by a;
> {code}
> The sql-1 result has only 3 rows:
> {code:sql}
> v1
> v2
> v3
> {code}
> When querying Kylin through the BI tool Tableau, the generated SQL is:
> {code:sql}
> -- sql-2:
> SELECT "X___SQL___"."a" AS "a"
> FROM (
> select a,b,c,d
> from tb
> group by a,b,c,d )
> "X___SQL___"
> GROUP BY "X___SQL___"."a"
> {code}
> sql-1 and sql-2 are equivalent, but sql-2 fails with the message:
> Scan row count exceeded threshold: 1000000, please add filter condition to narrow down backend scan range, like where clause.
> One solution:
> Add a Calcite planner optimization rule to kylin-query that removes the subquery columns that are not used by the outer query from the select, group by and order by clauses.
> With such a rule, sql-2 could be optimized to:
> {code:sql}
> SELECT "X___SQL___"."a" AS "a"
> FROM (
> select a
> from tb
> group by a )
> "X___SQL___"
> GROUP BY "X___SQL___"."a"
> {code}
> This SQL runs successfully.
> I do not know whether existing products like mysql/oracle/hive/presto/spark... have this optimization or not.
> Any comments are appreciated.
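The equivalence claimed in the issue can be checked on a small data set. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than Kylin or Calcite, the column types and the 1,000 generated rows are made up for illustration, and the `run` helper is hypothetical. It compares sql-1, sql-2 and the proposed pruned form of sql-2, and counts how many rows each subquery hands to the outer query.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the Kylin table 'tb'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb (a TEXT, b INTEGER, c INTEGER, d INTEGER)")

# Made-up data: 'a' takes only the values v1..v3, while the combination
# (a, b, c, d) is different for every one of the 1000 rows.
rows = [(f"v{i % 3 + 1}", i % 5, i % 7, i % 11) for i in range(1000)]
conn.executemany("INSERT INTO tb VALUES (?, ?, ?, ?)", rows)

SQL_1 = "select a from tb group by a"

WIDE_SUBQUERY = "select a, b, c, d from tb group by a, b, c, d"
NARROW_SUBQUERY = "select a from tb group by a"

# sql-2 as generated by the BI tool, and the pruned form proposed in the issue.
OUTER_TEMPLATE = (
    'SELECT "X___SQL___"."a" AS "a" '
    'FROM ({subquery}) "X___SQL___" '
    'GROUP BY "X___SQL___"."a"'
)
SQL_2 = OUTER_TEMPLATE.format(subquery=WIDE_SUBQUERY)
SQL_2_PRUNED = OUTER_TEMPLATE.format(subquery=NARROW_SUBQUERY)


def run(sql):
    """Hypothetical helper: return the first column of the result, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


result_1 = run(SQL_1)
result_2 = run(SQL_2)
result_pruned = run(SQL_2_PRUNED)

# Rows produced by each subquery before the outer GROUP BY is applied.
wide_rows = conn.execute(f"SELECT COUNT(*) FROM ({WIDE_SUBQUERY})").fetchone()[0]
narrow_rows = conn.execute(f"SELECT COUNT(*) FROM ({NARROW_SUBQUERY})").fetchone()[0]

print(result_1, result_2, result_pruned)
print(wide_rows, narrow_rows)
```

On this sample data all three queries return the same three values, while the unpruned subquery produces one row per input row and the pruned one produces three. The pruning is safe in this case because the outer query only groups by "a" and computes no aggregates; an outer query that counted or summed subquery rows would see different results after the extra group by columns were removed.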
--
This message was sent by Atlassian Jira
(v8.20.1#820001)