Posted to issues@kylin.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/05/06 07:00:00 UTC

[jira] [Commented] (KYLIN-5536) Kylin query optimization, by limiting the data range of max query, improve query efficiency

    [ https://issues.apache.org/jira/browse/KYLIN-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720130#comment-17720130 ] 

ASF subversion and git services commented on KYLIN-5536:
--------------------------------------------------------

Commit 5d251e313959b13cb15c2789eb69b6bb9b72c6b5 in kylin's branch refs/heads/kylin5 from sibingzhang
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=5d251e3139 ]

KYLIN-5536 Limit the segment range of MAX query to improve query performance

Co-authored-by: sibing.zhang <si...@qq.com>


> Kylin query optimization, by limiting the data range of max query, improve query efficiency
> -------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-5536
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5536
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Query Engine
>    Affects Versions: 5.0-alpha
>            Reporter: Yaguang Jia
>            Assignee: Yaguang Jia
>            Priority: Major
>             Fix For: 5.0-beta
>
>
> h2. Dev design
> 1. Add the configuration {{kylin.query.max-measure-segment-pruner-before-days}}
> This limits the time range scanned by the query. The default value is -1, which disables the optimization. When set to 0, no data is scanned. An invalid value (e.g. 0.1) leaves the optimization disabled. The setting can be specified at three levels: model, project, and system, in decreasing order of priority.
> 2. Where is the optimization applied?
> In the segment pruner, at org.apache.kylin.query.routing.RealizationPruner#pruneSegments
> 3. Which queries are optimized?
> select <max(partDT)> from T [where xxx]
> The query must select max() of the time partition column; the where condition is optional; there must be no group by column.
> 4. When the configuration is set, which segments are selected to answer the query?
> Starting from the last (newest) segment, segments are selected according to the configured time range.
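
The design above can be illustrated with a minimal, self-contained Java sketch. It is not the code from the commit: the class MaxMeasurePrunerSketch, the Segment type, and all method names are hypothetical, and several behaviours are assumptions made for illustration (the first level that has a value wins, then that value is validated; any non-integer or negative value is treated as -1; the window is anchored at the end of the newest segment; a segment is kept when it overlaps the window).

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

public class MaxMeasurePrunerSketch {

    /** Hypothetical stand-in for a segment covering the half-open range [startMs, endMs). */
    static final class Segment {
        final String name;
        final long startMs;
        final long endMs;

        Segment(String name, long startMs, long endMs) {
            this.name = name;
            this.startMs = startMs;
            this.endMs = endMs;
        }
    }

    /** Returns the configured days, or -1 (disabled) for null, non-integer (e.g. "0.1") or negative input. */
    static int parseBeforeDays(String raw) {
        if (raw == null) {
            return -1;
        }
        try {
            int days = Integer.parseInt(raw.trim());
            return days >= 0 ? days : -1;
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Assumption: the first level that has a value wins, in the order model, project, system. */
    static int resolveBeforeDays(String modelValue, String projectValue, String systemValue) {
        for (String raw : new String[] {modelValue, projectValue, systemValue}) {
            if (raw != null) {
                return parseBeforeDays(raw);
            }
        }
        return -1;
    }

    /** Keeps the segments that overlap the last beforeDays days, counted back from the newest segment's end. */
    static List<Segment> selectSegments(List<Segment> segments, int beforeDays) {
        if (beforeDays < 0 || segments.isEmpty()) {
            return new ArrayList<>(segments); // optimization disabled: no pruning
        }
        long latestEnd = segments.stream().mapToLong(s -> s.endMs).max().getAsLong();
        long cutoff = latestEnd - Duration.ofDays(beforeDays).toMillis();
        List<Segment> kept = new ArrayList<>();
        for (Segment s : segments) {
            if (s.endMs > cutoff) { // with beforeDays == 0 nothing passes, so no data is scanned
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        long day = Duration.ofDays(1).toMillis();
        List<Segment> segments = List.of(
                new Segment("A", 0, day),
                new Segment("B", day, 2 * day),
                new Segment("C", 2 * day, 3 * day));
        // The model level is unset here, so the project-level value "2" is used.
        int beforeDays = resolveBeforeDays(null, "2", "7");
        for (Segment s : selectSegments(segments, beforeDays)) {
            System.out.println(s.name);
        }
    }
}
```

In this sketch a value of 0 needs no special case: the cutoff equals the end of the newest segment, so no segment is kept, which matches the "no data is scanned" behaviour described in the issue.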



--
This message was sent by Atlassian Jira
(v8.20.10#820010)