Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/08/18 14:34:00 UTC

[jira] [Commented] (IMPALA-10806) Create single node plan is slow when hundreds of inline views are joined

    [ https://issues.apache.org/jira/browse/IMPALA-10806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401102#comment-17401102 ] 

ASF subversion and git services commented on IMPALA-10806:
----------------------------------------------------------

Commit caea9d3e14bfe4140221cc22721fb0b038345e9d in impala's branch refs/heads/master from xqhe
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=caea9d3 ]

IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Creating a single node plan for SQL of the following form can be slow
when hundreds of inline views are joined and each of view1, view2, ...
outputs hundreds of expressions.

select c1 from (select c1, id from view1 where c1 > 10) t1 join (select
c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ...
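
For experimenting with this query shape, a statement with N joined
inline views can be generated with a small script. This is only a
sketch: the helper name build_join_query is hypothetical, the tables
view1..viewN are assumed to exist with columns c1 and id, and every
inline view selects c1 for simplicity (the example above selects c2
from view2).

```python
def build_join_query(num_views: int, threshold: int = 10) -> str:
    """Build a query that joins `num_views` inline views on `id`.

    Hypothetical helper: assumes view1..viewN exist with columns c1 and id.
    """
    if num_views < 2:
        raise ValueError("need at least two inline views to join")
    parts = []
    for i in range(1, num_views + 1):
        view = f"(select c1, id from view{i} where c1 > {threshold}) t{i}"
        if i == 1:
            # The first inline view starts the FROM clause.
            parts.append(view)
        else:
            # Every later inline view is joined back to t1 on id.
            parts.append(f"join {view} on t1.id = t{i}.id")
    return "select t1.c1 from " + " ".join(parts)


if __name__ == "__main__":
    print(build_join_query(3))
```

Raising num_views into the hundreds gives a statement of the shape
described here. The slowdown also depends on how many expressions each
view outputs, which this script does not control.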

Plan generation is slow for two reasons:
1. Many auxiliary predicates are added to GlobalState.conjuncts, which
degrades the performance of Analyzer#getUnassignedConjuncts.
2. In SingleNodePlanner#createInlineViewPlan, the output smap is the
composition of the inline view's smap and the output smap of the inline
view's plan root. Joining many inline views degrades the performance of
ExprSubstitutionMap#compose.

For 1, add GlobalState.conjunctsWithoutAuxExpr to hold the registered
conjuncts that are not auxiliary predicates.
For 2, remove the expressions from outputSmap that are not used
according to baseSmap.
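
The two ideas can be pictured with a toy model. This is not the Impala
frontend code (which is Java): the class GlobalStateSketch and the
functions prune_output_smap and compose are hypothetical, expressions
are modeled as plain strings, and substitution maps as dicts from one
atomic expression to another. Real substitution maps hold expression
trees, so "used" there means referenced anywhere inside an expression.

```python
from dataclasses import dataclass, field


@dataclass
class Conjunct:
    expr_id: int
    text: str
    is_auxiliary: bool = False
    assigned: bool = False


@dataclass
class GlobalStateSketch:
    # Every registered conjunct, auxiliary predicates included.
    conjuncts: dict = field(default_factory=dict)
    # Second collection that leaves auxiliary predicates out (idea 1).
    conjuncts_without_aux: dict = field(default_factory=dict)

    def register(self, c: Conjunct) -> None:
        self.conjuncts[c.expr_id] = c
        if not c.is_auxiliary:
            self.conjuncts_without_aux[c.expr_id] = c

    def unassigned_conjuncts(self) -> list:
        # Scans only non-auxiliary conjuncts, so the cost no longer
        # grows with the number of auxiliary predicates.
        return [c for c in self.conjuncts_without_aux.values()
                if not c.assigned]


def compose(f: dict, g: dict) -> dict:
    """Toy composition: apply f, then g, on atomic expressions."""
    result = {lhs: g.get(rhs, rhs) for lhs, rhs in f.items()}
    for lhs, rhs in g.items():
        # Carry over g's entries that f does not already define.
        result.setdefault(lhs, rhs)
    return result


def prune_output_smap(base_smap: dict, output_smap: dict) -> dict:
    """Keep only output_smap entries that base_smap can reach (idea 2)."""
    used = set(base_smap.values())
    return {lhs: rhs for lhs, rhs in output_smap.items() if lhs in used}
```

In this model, composing base_smap with the pruned output map yields
the same substitutions for the expressions in base_smap while the
composed map carries fewer entries, which is the effect the second
change aims for.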

Testing:
 Added tests/query_test/test_query_compilation.py.
 For the repro query, single node plan creation went from 2.3 sec to
 0.3 sec.
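
A timing like this can be read from the "Query Compilation" timeline in
the query profile. The sketch below is not taken from the test file
named above; it assumes timeline lines of the form
"- Single node plan created: 3s518ms (3s242ms)", and the function names
duration_to_ms and parse_timeline are hypothetical.

```python
import re

# Milliseconds per unit; "ms", "us" and "ns" are listed before "m" and
# "s" in the pattern so that the longer unit is matched first.
_UNIT_MS = {"h": 3_600_000.0, "m": 60_000.0, "s": 1000.0,
            "ms": 1.0, "us": 0.001, "ns": 0.000001}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)")
_EVENT = re.compile(
    r"^\s*(?:>\s*)?-\s+(?P<event>.+?):\s+(?P<elapsed>\S+)"
    r"\s+\((?P<delta>\S+)\)\s*$")


def duration_to_ms(text: str) -> float:
    """Convert a duration such as '3s242ms' or '424.022us' to ms."""
    parts = _PART.findall(text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(n) * _UNIT_MS[u] for n, u in parts)


def parse_timeline(profile_text: str) -> dict:
    """Map each timeline event to the time spent in that step, in ms."""
    deltas = {}
    for line in profile_text.splitlines():
        m = _EVENT.match(line)
        if m:
            deltas[m.group("event")] = duration_to_ms(m.group("delta"))
    return deltas
```

Calling max(deltas, key=deltas.get) on the parsed result names the
slowest planning step, which makes a regression in single node plan
creation easy to spot.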

Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Reviewed-on: http://gerrit.cloudera.org:8080/17712
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Qifan Chen <qc...@cloudera.com>


> Create single node plan is slow when hundreds of inline views are joined
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-10806
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10806
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.2.0, Impala 4.0.0
>            Reporter: Xianqing He
>            Assignee: Xianqing He
>            Priority: Major
>             Fix For: Impala 4.1.0
>
>
> For SQL of the form
> {code:java}
> select c1 from (select c1, id from view1 where c1>10) t1 join (select c2, id from view2 where c1>10) t2 on t1.id=t2.id join ...{code}
> Query Compilation: 3s642ms
>  - Metadata of all 90 tables cached: 1.757ms (1.757ms)
>  - Analysis finished: 229.621ms (227.863ms)
>  - Authorization finished (noop): 235.024ms (5.402ms)
>  - Value transfer graph computed: 275.756ms (40.731ms)
>  - Single node plan created: 3s518ms ({color:#FF0000}3s242ms{color})
>  - Runtime filters computed: 3s625ms (106.697ms)
>  - Distributed plan created: 3s625ms (424.022us)
>  - Planning finished: 3s642ms (17.134ms)
>  


