Posted to issues@kylin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/02/07 09:57:00 UTC

[jira] [Commented] (KYLIN-4888) Performance optimization of union query with spark engine

    [ https://issues.apache.org/jira/browse/KYLIN-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280428#comment-17280428 ] 

ASF GitHub Bot commented on KYLIN-4888:
---------------------------------------

hit-lacus merged pull request #1562:
URL: https://github.com/apache/kylin/pull/1562


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


>  Performance optimization of union query with spark engine
> ----------------------------------------------------------
>
>                 Key: KYLIN-4888
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4888
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Spark Engine
>    Affects Versions: v4.0.0-alpha
>            Reporter: Feng Zhu
>            Assignee: Feng Zhu
>            Priority: Major
>             Fix For: v4.0.0-GA
>
>         Attachments: spark_union_plan_comparison, stages before.png, stages_after.png
>
>
> When a union query runs on the Spark engine, UnionPlan transforms OLAPUnionRel into a Spark
> DataFrame. When OLAPUnionRel.all = false, Spark's distinct transformation is applied, but
> it is applied inside the loop that traverses the DataFrame collection, so we do not get the expected optimized flattenUnion plan (Spark's CombineUnions rule optimizes the distinct, but the nested union plan is not flattened), and the Spark DAG ends up with many stages. Actually, the distinct transformation should be applied only once, at the end.
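
The point of the description above, that deduplicating once at the end gives the same rows as deduplicating after every union, can be sketched in plain Python. This is only a toy model on lists of tuples, not Kylin's UnionPlan or the Spark DataFrame API; all function names here are hypothetical and chosen for illustration.

```python
from functools import reduce


def union_all(left, right):
    """Concatenate two row lists, keeping duplicates (like UNION ALL)."""
    return left + right


def distinct(rows):
    """Drop duplicate rows while keeping first-seen order."""
    return list(dict.fromkeys(rows))


def union_distinct_in_loop(frames):
    """Pattern described in the issue: deduplicate after every pairwise union."""
    calls = 0
    result = frames[0]
    for frame in frames[1:]:
        result = distinct(union_all(result, frame))
        calls += 1
    return result, calls


def union_distinct_once(frames):
    """Suggested pattern: union everything first, deduplicate once at the end."""
    merged = reduce(union_all, frames)
    return distinct(merged), 1


# Hypothetical sample data standing in for three DataFrames.
frames = [
    [(1, "a"), (2, "b")],
    [(2, "b"), (3, "c")],
    [(3, "c"), (1, "a"), (4, "d")],
]

loop_rows, loop_calls = union_distinct_in_loop(frames)
once_rows, once_calls = union_distinct_once(frames)
```

In this toy model both functions return the same rows, but the loop version calls distinct once per union step, so the number of deduplication passes grows with the number of inputs, while the second version always uses one. How that maps onto Spark stages depends on the optimizer and is not modeled here.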



--
This message was sent by Atlassian Jira
(v8.3.4#803005)