Posted to reviews@impala.apache.org by "Xianqing He (Code Review)" <ge...@cloudera.org> on 2021/07/22 10:32:38 UTC

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Xianqing He has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17712 )


Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................

IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Creating a single node plan is slow for SQL of the following form, in
which hundreds of inline views are joined and view1, view2, ... each
output hundreds of expressions.

select c1 from (select c1, id from view1 where c1 > 10) t1 join (select
c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ...

Plan generation is slow for two reasons:
1. Auxiliary predicates are added to GlobalState.conjuncts, which
degrades the performance of Analyzer#getUnassignedConjuncts.
2. In SingleNodePlanner#createInlineViewPlan, the output smap is the
composition of the inline view's smap and the output smap of the inline
view's plan root. With many joined inline views,
ExprSubstitutionMap#compose becomes slow.

For 1, add GlobalState.conjunctsWithoutAuxExpr to hold the registered
conjuncts that are not auxiliary predicates.
For 2, remove the expressions from outputSmap that baseSmap does not
use.
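
The idea behind fix 1 can be sketched in a few lines. This is an
illustrative Python model, not the Java code in Analyzer.java: the class
and field names below are hypothetical, and it assumes that the lookup
for unassigned conjuncts ignores auxiliary predicates anyway, so keeping
them out of the scanned collection does not change the result.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Conjunct:
    """Hypothetical stand-in for a registered predicate."""
    expr_id: int
    sql: str
    is_auxiliary: bool = False


@dataclass
class GlobalState:
    # Every registered conjunct, auxiliary or not, keyed by id.
    conjuncts: Dict[int, Conjunct] = field(default_factory=dict)
    # Only the non-auxiliary conjuncts, maintained at registration time.
    conjuncts_without_aux: Dict[int, Conjunct] = field(default_factory=dict)
    # Ids of conjuncts already assigned to a plan node.
    assigned: Set[int] = field(default_factory=set)

    def register(self, conjunct: Conjunct) -> None:
        self.conjuncts[conjunct.expr_id] = conjunct
        if not conjunct.is_auxiliary:
            self.conjuncts_without_aux[conjunct.expr_id] = conjunct

    def unassigned_conjuncts(self) -> List[Conjunct]:
        # Scan the smaller map instead of filtering auxiliary
        # predicates out of the full map on every call.
        return [c for c in self.conjuncts_without_aux.values()
                if c.expr_id not in self.assigned]
```

The cost of each lookup then depends on the number of non-auxiliary
conjuncts rather than on the total number registered.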

Testing:
For the repro query, single node plan creation went from 4 sec to less
than 1 sec.

Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
3 files changed, 34 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17712/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17712
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 1
Gerrit-Owner: Xianqing He <he...@126.com>

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7400/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 10
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Tue, 17 Aug 2021 19:30:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................

IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Creating a single node plan can sometimes be slow for SQL of the
following form, in which hundreds of inline views are joined and view1,
view2, ... each output hundreds of expressions.

select c1 from (select c1, id from view1 where c1 > 10) t1 join (select
c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ...

Plan generation is slow for two reasons:
1. Many auxiliary predicates are added to GlobalState.conjuncts, which
degrades the performance of Analyzer#getUnassignedConjuncts.
2. In SingleNodePlanner#createInlineViewPlan, the output smap is the
composition of the inline view's smap and the output smap of the inline
view's plan root. With many joined inline views,
ExprSubstitutionMap#compose becomes slow.

For 1, add GlobalState.conjunctsWithoutAuxExpr to hold the registered
conjuncts that are not auxiliary predicates.
For 2, remove the expressions from outputSmap that baseSmap does not
use.
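
Fix 2 can be illustrated with a simplified model of substitution maps.
The sketch below is Python, not the Java in ExprSubstitutionMap.java,
and it makes strong assumptions: expressions are plain strings, a map is
a dict from lhs to rhs, and substitution is a whole-expression lookup.
The function names and the sample column names are hypothetical.

```python
from typing import Dict

Smap = Dict[str, str]  # lhs expression -> rhs expression


def trim(output_smap: Smap, base_smap: Smap) -> Smap:
    """Keep only the output_smap entries whose lhs base_smap refers to."""
    used = set(base_smap.values())
    return {lhs: rhs for lhs, rhs in output_smap.items() if lhs in used}


def compose(f: Smap, g: Smap) -> Smap:
    """Apply g to each rhs of f, then carry over g entries f lacks."""
    result = {lhs: g.get(rhs, rhs) for lhs, rhs in f.items()}
    for lhs, rhs in g.items():
        result.setdefault(lhs, rhs)
    return result


# The inline view exposes two columns; the view's plan root outputs 501.
base = {"t1.c1": "view1.c1", "t1.id": "view1.id"}
output = {f"view1.c{i}": f"slot{i}" for i in range(1, 501)}
output["view1.id"] = "slot_id"

trimmed = trim(output, base)         # 2 entries instead of 501
composed = compose(base, trimmed)    # 4 entries instead of 503
```

In this model the trimmed and untrimmed compositions agree on every
expression the inline view exposes, while the composed map stays small
no matter how many expressions the underlying view outputs.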

Testing:
 Added the test tests/query_test/test_query_compilation.py.
 For the repro query, single node plan creation went from 2.3 sec to
 0.3 sec.

Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Reviewed-on: http://gerrit.cloudera.org:8080/17712
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Qifan Chen <qc...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A tests/query_test/test_query_compilation.py
4 files changed, 74 insertions(+), 4 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Qifan Chen: Looks good to me, approved


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 11
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 3:

(2 comments)

Looks very good. I just wonder if we could add a query test to safeguard the reduction in compilation time.

http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@29
PS2, Line 29: Testing:
> I didn't find the right place to add tests and add a repro query here
Nit: I wonder if the query mentioned in the commit message can be used as the test query, with an assertion that "Single node plan created" in the profile is less than 1.5s?
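
A check of that kind could look roughly like the following. This is a
minimal sketch and not the test that was later added in
tests/query_test/test_query_compilation.py. The format of the timeline
entry is an assumption (elapsed time first, time spent in the step in
parentheses), and all function names are hypothetical.

```python
import re

# Assumed shape of the timeline entry (not quoted in this thread):
#   - Single node plan created: 550.000ms (300.000ms)
# The first value is taken to be the time since compilation began and
# the value in parentheses the time spent in this step.
_ENTRY = re.compile(r"-\s*Single node plan created:\s*(\S+)\s*\((\S+)\)")
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)")
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3,
                 "s": 1.0, "m": 60.0, "h": 3600.0}


def to_seconds(pretty: str) -> float:
    """Convert a duration such as '2s300ms' or '300.000ms' to seconds."""
    parts = _PART.findall(pretty)
    if not parts:
        raise ValueError(f"unrecognised duration: {pretty!r}")
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in parts)


def single_node_plan_seconds(profile_text: str) -> float:
    """Return the time spent creating the single node plan."""
    match = _ENTRY.search(profile_text)
    if match is None:
        raise ValueError("profile has no 'Single node plan created' entry")
    return to_seconds(match.group(2))


def assert_plan_created_within(profile_text: str,
                               limit_seconds: float = 1.5) -> None:
    """Fail if single node plan creation took limit_seconds or longer."""
    elapsed = single_node_plan_seconds(profile_text)
    assert elapsed < limit_seconds, (
        f"single node plan took {elapsed:.3f}s, limit is {limit_seconds}s")
```

A wall-clock threshold like this can be flaky on a loaded machine, so the
limit needs some headroom over the expected time.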


http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1218
PS2, Line 1218: pr> nullableR
> I think there is not a way to know the trimming is beneficial in advance, b
Done




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 3
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Tue, 27 Jul 2021 12:53:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7359/ DRY_RUN=true



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 6
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Fri, 30 Jul 2021 09:21:07 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7340/



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 2
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Jul 2021 15:38:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 10: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 10
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Wed, 18 Aug 2021 13:32:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Xianqing He (Code Review)" <ge...@cloudera.org>.
Xianqing He has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................

IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Creating a single node plan can sometimes be slow for SQL of the
following form, in which hundreds of inline views are joined and view1,
view2, ... each output hundreds of expressions.

select c1 from (select c1, id from view1 where c1 > 10) t1 join (select
c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ...

Plan generation is slow for two reasons:
1. Many auxiliary predicates are added to GlobalState.conjuncts, which
degrades the performance of Analyzer#getUnassignedConjuncts.
2. In SingleNodePlanner#createInlineViewPlan, the output smap is the
composition of the inline view's smap and the output smap of the inline
view's plan root. With many joined inline views,
ExprSubstitutionMap#compose becomes slow.

For 1, add GlobalState.conjunctsWithoutAuxExpr to hold the registered
conjuncts that are not auxiliary predicates.
For 2, remove the expressions from outputSmap that baseSmap does not
use.

Testing:
 Added the test tests/query_test/test_query_compilation.py.
 For the repro query, single node plan creation went from 2.3 sec to
 0.3 sec.

Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A tests/query_test/test_query_compilation.py
4 files changed, 74 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17712/8

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 8
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9199/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 4
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Wed, 28 Jul 2021 10:32:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 2:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7344/



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 2
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Jul 2021 09:18:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 6: Code-Review+1

Looks great!



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 6
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Wed, 28 Jul 2021 13:16:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9200/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Wed, 28 Jul 2021 10:50:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 10: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 10
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Wed, 18 Aug 2021 01:38:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Xianqing He (Code Review)" <ge...@cloudera.org>.
Xianqing He has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................

IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Creating a single node plan can sometimes be slow for SQL of the
following form, in which hundreds of inline views are joined and view1,
view2, ... each output hundreds of expressions.

select c1 from (select c1, id from view1 where c1 > 10) t1 join (select
c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ...

Plan generation is slow for two reasons:
1. Many auxiliary predicates are added to GlobalState.conjuncts, which
degrades the performance of Analyzer#getUnassignedConjuncts.
2. In SingleNodePlanner#createInlineViewPlan, the output smap is the
composition of the inline view's smap and the output smap of the inline
view's plan root. With many joined inline views,
ExprSubstitutionMap#compose becomes slow.

For 1, add GlobalState.conjunctsWithoutAuxExpr to hold the registered
conjuncts that are not auxiliary predicates.
For 2, remove the expressions from outputSmap that baseSmap does not
use.

Testing:
Performance testing with a query of the following form:
  with aa as (select * from (select * from
  functional.widetable_1000_cols) t where int_col1=10)
  select t1.int_col1 from aa t1 join aa t2
  on t1.int_col1=t2.int_col2
  join aa t3 on t1.int_col1=t3.int_col1 join aa t4
  on t1.int_col1=t4.int_col1
For the repro query, single node plan creation went from 2.3 sec to
0.3 sec.
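
To vary the number of joins when reproducing the slowdown, a query of
the form above can be generated rather than written by hand. The helper
below is a hypothetical sketch in Python. It is not part of the change,
and unlike the query above it joins every alias on int_col1.

```python
def build_repro_query(num_joins: int,
                      table: str = "functional.widetable_1000_cols") -> str:
    """Build a self-join query over a wide inline view.

    num_joins is the number of "join aa tN" clauses appended after the
    first alias t1, so the query references num_joins + 1 aliases.
    """
    if num_joins < 1:
        raise ValueError("num_joins must be at least 1")
    cte = (f"with aa as (select * from (select * from {table}) t "
           "where int_col1=10)")
    joins = " ".join(
        f"join aa t{i} on t1.int_col1=t{i}.int_col1"
        for i in range(2, num_joins + 2))
    return f"{cte} select t1.int_col1 from aa t1 {joins}"
```

Calling build_repro_query(3) yields the same shape as the query above:
aliases t1 through t4 over the same wide inline view.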

Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
3 files changed, 32 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17712/3

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 3
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17712/5/tests/query_test/test_query_compilation.py
File tests/query_test/test_query_compilation.py:

http://gerrit.cloudera.org:8080/#/c/17712/5/tests/query_test/test_query_compilation.py@21
PS5, Line 21: class TestSingleNodePlanCreated(ImpalaTestSuite):
flake8: E302 expected 2 blank lines, found 1




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Wed, 28 Jul 2021 10:29:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9201/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 6
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Wed, 28 Jul 2021 10:47:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9243/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 8
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Thu, 05 Aug 2021 05:56:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 8: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 8
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Thu, 05 Aug 2021 11:35:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Xianqing He (Code Review)" <ge...@cloudera.org>.
Xianqing He has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................

IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Creating a single node plan can sometimes be slow for SQL of the
following form, in which hundreds of inline views are joined and view1,
view2, ... each output hundreds of expressions.

select c1 from (select c1, id from view1 where c1 > 10) t1 join (select
c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ...

Plan generation is slow for two reasons:
1. Many auxiliary predicates are added to GlobalState.conjuncts, which
degrades the performance of Analyzer#getUnassignedConjuncts.
2. In SingleNodePlanner#createInlineViewPlan, the output smap is the
composition of the inline view's smap and the output smap of the inline
view's plan root. With many joined inline views,
ExprSubstitutionMap#compose becomes slow.

For 1, add GlobalState.conjunctsWithoutAuxExpr to hold the registered
conjuncts that are not auxiliary predicates.
For 2, remove the expressions from outputSmap that baseSmap does not
use.

Testing:
 Added the test tests/query_test/test_query_compilation.py.
 For the repro query, single node plan creation went from 2.3 sec to
 0.3 sec.

Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A tests/query_test/test_query_compilation.py
4 files changed, 72 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17712/5

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7344/ DRY_RUN=true



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 2
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Jul 2021 02:29:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9180/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 3
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Tue, 27 Jul 2021 09:30:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7360/ DRY_RUN=true


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 7
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Sat, 31 Jul 2021 07:54:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7401/ DRY_RUN=false


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 10
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Tue, 17 Aug 2021 19:32:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9141/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 1
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Jul 2021 11:01:07 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Xianqing He (Code Review)" <ge...@cloudera.org>.
Xianqing He has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 3:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@9
PS2, Line 9: Creating a single node plan for the following SQL sometime
> nit. "Creating a single node plan for the following SQL sometime can slowdo
Done


http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@16
PS2, Line 16: The reasons for the slow generation of plans are as follows
> nit. "are as follows".
Done


http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@17
PS2, Line 17: 1. Many auxiliary predicates are added to GlobalState.conjuncts causing
> nit. "Many auxiliary predicates"
Done


http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@19
PS2, Line 19: I
> nit. In
Done


http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@29
PS2, Line 29: Testing:
> May add some new tests to demonstrate the compilation time reduction.
I couldn't find a suitable place to add tests, so I added a repro query here instead.


http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@391
PS2, Line 391: conjunctsFromQuery = ne
> nit. Based on how this map is populated, it may be better to rename the map
Done


http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java:

http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@190
PS2, Line 190: ExprSu
> Do we need to handle element not exist exception?
Removed this method.


http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1218
PS2, Line 1218: pr> nullableR
> nit. If all expressions on RHS are materialized, then this entire trimming 
I don't think there is a way to know in advance whether the trimming is beneficial, but its cost is negligible compared to ExprSubstitutionMap#compose.


http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1209
PS2, Line 1209: if (outputSmap != null) outputSmap.trim(inlineViewRef.getBaseTblSmap(), analyzer);
              :     outputSmap = ExprSubstitutionMap.compose(
              :         outputSmap, rootNode.getOutputSmap(), analyzer);
              :     if (analyzer.isOuterJoined(inlineViewRef.getId())) {
              :       // Exprs against non-matched rows of an outer join should always return NULL.
              :       // Make the rhs exprs of the output smap nullable, if necessary. This expr wrapping
              :       // must be performed on the composed smap, and not on the inline view's smap,
              :       // because the rhs exprs must first be resolved against the physical output of
              :       // 'planRoot' to correctly determine whether wrapping is necessary.
              :       List<Expr> nullableRhs = TupleIsNullPredicate.wrapExprs(
              :           outputSmap.getRhs(), rootNode.getTupleIds(), analyzer);
              :       outputSmap = new ExprSubstitutionMap(outputSmap.getLhs(), nullableRhs);
              :     }
              :     // Set output smap of rootNode *before* creating a SelectNode for proper resolution.
              :     rootNode.setOutputSmap(outputSmap);
              : 
> nit. Wonder if this block of code can be made a new method as ExprSubstitut
Done



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 3
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Tue, 27 Jul 2021 09:17:43 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7372/ DRY_RUN=true


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 8
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Thu, 05 Aug 2021 05:35:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7340/ DRY_RUN=true


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 2
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Jul 2021 09:34:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7360/


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 7
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Sat, 31 Jul 2021 13:53:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Xianqing He (Code Review)" <ge...@cloudera.org>.
Xianqing He has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@29
PS2, Line 29: Testing:
> nit. I wonder if the query mentioned in the commit message can be used as t
Done



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 4
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Wed, 28 Jul 2021 10:10:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 8: Code-Review+2


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 8
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Thu, 05 Aug 2021 12:56:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17712/4/tests/query_test/test_query_compilation.py
File tests/query_test/test_query_compilation.py:

http://gerrit.cloudera.org:8080/#/c/17712/4/tests/query_test/test_query_compilation.py@18
PS4, Line 18: import pytest
flake8: F401 'pytest' imported but unused


http://gerrit.cloudera.org:8080/#/c/17712/4/tests/query_test/test_query_compilation.py@21
PS4, Line 21: class TestSingleNodePlanCreated(ImpalaTestSuite):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/17712/4/tests/query_test/test_query_compilation.py@38
PS4, Line 38: 
flake8: W292 no newline at end of file



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 4
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Wed, 28 Jul 2021 10:10:32 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Xianqing He (Code Review)" <ge...@cloudera.org>.
Xianqing He has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................

IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Creating a single node plan for the following SQL sometime can slowdown,
with about hundreds of inlineviews to join, and view1, view2... outputs
hundreds of expressions.

select c1 from (select c1, id from view1 where c1 > 10) t1 join (select
c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ...

The reasons for the slow generation of plans are as follows
1. Many auxiliary predicates are added to GlobalState.conjuncts causing
performance degradation of Analyzer#getUnassignedConjuncts
2. In SingleNodePlanner#createInlineViewPlan the output smap is the
composition of the inline view's smap and the output smap of the inline
view's plan root. Multiple inline view joins cause
ExprSubstitutionMap#compose performance to degrade.

For 1, add GlobalState.conjunctsWithoutAuxExpr to save the registered
conjuncts without auxiliary predicate.
For 2, remove expressions from outputSmap that are not used according
to baseSmap.

Testing:
 Add test tests/query_test/test_query_compilation.py
 Repro query created single node plan went from 2.3 sec to 0.3 sec.

Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A tests/query_test/test_query_compilation.py
4 files changed, 72 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17712/6
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 6
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Xianqing He (Code Review)" <ge...@cloudera.org>.
Xianqing He has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................

IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Creating a single node plan for the following SQL sometime can slowdown,
with about hundreds of inlineviews to join, and view1, view2... outputs
hundreds of expressions.

select c1 from (select c1, id from view1 where c1 > 10) t1 join (select
c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ...

The reasons for the slow generation of plans are as follows
1. Many auxiliary predicates are added to GlobalState.conjuncts causing
performance degradation of Analyzer#getUnassignedConjuncts
2. In SingleNodePlanner#createInlineViewPlan the output smap is the
composition of the inline view's smap and the output smap of the inline
view's plan root. Multiple inline view joins cause
ExprSubstitutionMap#compose performance to degrade.

For 1, add GlobalState.conjunctsWithoutAuxExpr to save the registered
conjuncts without auxiliary predicate.
For 2, remove expressions from outputSmap that are not used according
to baseSmap.

Testing:
 Add test tests/query_test/test_query_compilation.py
 Repro query created single node plan went from 2.3 sec to 0.3 sec.

Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A tests/query_test/test_query_compilation.py
4 files changed, 70 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17712/4
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 4
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 2:

(9 comments)

Looks good!

http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@9
PS2, Line 9: Create single node plan slowdown in the following form SQL
nit. "Creating a single node plan for the following SQL sometime can slowdown"


http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@16
PS2, Line 16: The reasons for the slow generation of plans are
nit. "are as follows".


http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@17
PS2, Line 17: 1. auxiliary predicates are added to GlobalState.conjuncts causing
nit. "Many auxiliary predicates"


http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@19
PS2, Line 19: i
nit. In


http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@29
PS2, Line 29: Testing:
May add some new tests to demonstrate the compilation time reduction.


http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@391
PS2, Line 391: conjunctsWithoutAuxExpr
nit. Based on how this map is populated, it may be better to rename the map as conjunctsFromQuery.
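
For illustration only, the idea behind this second collection can be sketched in isolation. The class below is a hypothetical stand-in and not Impala's Analyzer: it assumes conjuncts are keyed by an integer id, that predicates are plain strings, and that lookups of unassigned conjuncts never need auxiliary predicates, so they can scan the smaller map. All names other than conjuncts and conjunctsFromQuery are invented for the sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: keeps query conjuncts apart from auxiliary ones. */
final class ConjunctRegistrySketch {
  // Every registered conjunct, including auxiliary predicates.
  private final Map<Integer, String> conjuncts = new LinkedHashMap<>();
  // Only the conjuncts that are not auxiliary predicates.
  private final Map<Integer, String> conjunctsFromQuery = new LinkedHashMap<>();
  private final Set<Integer> assigned = new HashSet<>();
  private int nextId = 0;

  /** Registers a predicate and returns its id. */
  int register(String predicate, boolean isAuxiliary) {
    int id = nextId++;
    conjuncts.put(id, predicate);
    if (!isAuxiliary) conjunctsFromQuery.put(id, predicate);
    return id;
  }

  void markAssigned(int id) { assigned.add(id); }

  /** Scans only the non-auxiliary map, so auxiliary predicates add no cost here. */
  List<String> getUnassignedConjuncts() {
    List<String> result = new ArrayList<>();
    for (Map.Entry<Integer, String> e : conjunctsFromQuery.entrySet()) {
      if (!assigned.contains(e.getKey())) result.add(e.getValue());
    }
    return result;
  }

  int totalRegistered() { return conjuncts.size(); }
}
```

With many auxiliary predicates registered, the scan in getUnassignedConjuncts() grows only with the number of non-auxiliary conjuncts, at the cost of maintaining a second map.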


http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java:

http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@190
PS2, Line 190: remove
Do we need to handle element not exist exception?


http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1218
PS2, Line 1218: (!analyzer.ge
nit. If all expressions on RHS are materialized, then this entire trimming operation is a no-op and could be expensive. Is there a way to know the trimming is beneficial in advance?


http://gerrit.cloudera.org:8080/#/c/17712/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1209
PS2, Line 1209: if (outputSmap != null) {
              :       // Remove expressions from outputSmap that are not used according to baseSmap,
              :       // in order to optimize the performance of ExprSubstitutionMap#compose
              :       ExprSubstitutionMap baseSmap = inlineViewRef.getBaseTblSmap();
              :       Preconditions.checkState(outputSmap.size() == baseSmap.size());
              :       for (int i = outputSmap.size() - 1; i >= 0; --i) {
              :         List<SlotId> slotIds = new ArrayList<>();
              :         baseSmap.getRhs().get(i).getIds(null, slotIds);
              :         for (SlotId id: slotIds) {
              :           if (!analyzer.getSlotDesc(id).isMaterialized()) {
              :             outputSmap.remove(i);
              :             break;
              :           }
              :         }
              :       }
              :     }
nit. Wonder if this block of code can be made a new method as ExprSubstitutionMap::trim(ExprSubstitutionMap baseTblSMap).
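
A minimal sketch of what such a trim method could look like, written against simplified stand-in types rather than Impala's real classes. It assumes a substitution map is a pair of parallel lhs/rhs lists, that an expression exposes the ids of the slots it references, and that materialization is given as a set of slot ids; Expr, Smap, and the demo() helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SmapTrimSketch {
  /** Hypothetical expression: a label plus the ids of the slots it references. */
  static final class Expr {
    final String label;
    final List<Integer> slotIds;
    Expr(String label, Integer... slotIds) {
      this.label = label;
      this.slotIds = Arrays.asList(slotIds);
    }
  }

  /** Hypothetical substitution map stored as parallel lhs/rhs lists. */
  static final class Smap {
    final List<Expr> lhs = new ArrayList<>();
    final List<Expr> rhs = new ArrayList<>();

    void put(Expr l, Expr r) { lhs.add(l); rhs.add(r); }
    int size() { return lhs.size(); }

    /** Drops entry i when base.rhs.get(i) references an unmaterialized slot. */
    void trim(Smap base, Set<Integer> materializedSlots) {
      if (size() != base.size()) throw new IllegalStateException("size mismatch");
      // Walk backwards so removals do not shift the indices still to be visited.
      for (int i = size() - 1; i >= 0; --i) {
        if (!materializedSlots.containsAll(base.rhs.get(i).slotIds)) {
          lhs.remove(i);
          rhs.remove(i);
        }
      }
    }
  }

  /** Builds a three-entry map, trims it, and returns the surviving lhs labels. */
  static List<String> demo() {
    Smap output = new Smap();
    Smap base = new Smap();
    output.put(new Expr("t1.c1"), new Expr("view1.c1"));
    base.put(new Expr("t1.c1"), new Expr("slot 1", 1));
    output.put(new Expr("t1.c2"), new Expr("view1.c2"));
    base.put(new Expr("t1.c2"), new Expr("slot 2", 2));
    output.put(new Expr("t1.id"), new Expr("view1.id"));
    base.put(new Expr("t1.id"), new Expr("slot 3", 3));

    // Slot 2 is not materialized, so the t1.c2 entry is removed.
    Set<Integer> materialized = new HashSet<>(Arrays.asList(1, 3));
    output.trim(base, materialized);

    List<String> labels = new ArrayList<>();
    for (Expr e : output.lhs) labels.add(e.label);
    return labels;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Only the trimmed map shrinks; the base map is left untouched, mirroring the loop quoted above, which removes entries from outputSmap alone.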



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 2
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Jul 2021 14:57:01 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 )

Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
......................................................................


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7359/


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a
Gerrit-Change-Number: 17712
Gerrit-PatchSet: 6
Gerrit-Owner: Xianqing He <he...@126.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xianqing He <he...@126.com>
Gerrit-Comment-Date: Fri, 30 Jul 2021 15:34:56 +0000
Gerrit-HasComments: No