You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Quanlong Huang (Code Review)" <ge...@cloudera.org> on 2021/03/15 06:22:22 UTC

[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving

Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17184


Change subject: IMPALA-10483(part-1): Refactor table mask resolving
......................................................................

IMPALA-10483(part-1): Refactor table mask resolving

Table masking(i.e. column-masking/row-filtering) is resolved in the
analyzing phase when resolving the based table/view. The based
table/view is then replaced by a table masking view. After the analyzing
phase, the query may be rewritten, which requires re-analyzing. Reset()
will be called on the AST to reset all the analyzed states before
re-analyze it again. Currently, we consider table masking as the result
of analyzing, so we revert them in FromClause.reset(). This avoids
duplicated masking. However, it may result in a suboptimal plan since no
query rewritters will be applied after the re-analyzing phase. What's
worse, it can't avoid duplicated masking in the query rewriting phase
which will also analyze some rewritten AST. This happens more frequently
in supporting subquery row filters. Since the subquery will be rewritten
into JOIN and being reanalyzed in the rewriter.

This patch refactor the table mask resolving logic to not apply table
mask by default. Only the first analyzing phase requires masking. With
this change, we don't need the complex logic in FromClause.reset() to
revert the masking. Its implementation is reverted to the behavior
before we support column masking.

Tests
 - This is a refactoring so no new tests are added.

Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/WithClause.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
13 files changed, 79 insertions(+), 60 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/17184/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
Gerrit-Change-Number: 17184
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17184 )

Change subject: IMPALA-10483(part-1): Refactor table mask resolving
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8366/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
Gerrit-Change-Number: 17184
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 15 Mar 2021 06:42:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17184 )

Change subject: IMPALA-10483(part-1): Refactor table mask resolving
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17184/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17184/1//COMMIT_MSG@10
PS1, Line 10: base 
> nit: 'base' instead of 'based' here and few other places.
Done


http://gerrit.cloudera.org:8080/#/c/17184/1//COMMIT_MSG@23
PS1, Line 23: This patch refactor the table mask resolving logic to not apply table
> One effect of not applying table mask by default is that now each of the cl
Yeah, I think it's a trade-off. Before this patch we apply table mask by default. Then we need to turn it off in many other places, e.g. in analyze() invoked by statement rewriters. If the programmer forgets to turn it off when adding a new rewriter, we may have duplicate or infinite recursive masks on the table. Agree that test coverage are critical for both side.


http://gerrit.cloudera.org:8080/#/c/17184/1/fe/src/main/java/org/apache/impala/analysis/WithClause.java
File fe/src/main/java/org/apache/impala/analysis/WithClause.java:

http://gerrit.cloudera.org:8080/#/c/17184/1/fe/src/main/java/org/apache/impala/analysis/WithClause.java@66
PS1, Line 66:   @Override
> I would think this would be an @Override of a corresponding method in base 
Yeah, I'm hesitated on this. The base class (StmtNode) is very clean that only has an analyze() method. So I try to limit the changes of this patch. Added it to StmtNode since it causes confusion to you.



-- 
To view, visit http://gerrit.cloudera.org:8080/17184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
Gerrit-Change-Number: 17184
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 16 Mar 2021 09:37:16 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Aman Sinha, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17184

to look at the new patch set (#2).

Change subject: IMPALA-10483(part-1): Refactor table mask resolving
......................................................................

IMPALA-10483(part-1): Refactor table mask resolving

Table masking(i.e. column-masking/row-filtering) is resolved in the
analyzing phase when resolving the base table/view. The base
table/view is then replaced by a table masking view. After the analyzing
phase, the query may be rewritten, which requires re-analyzing. Reset()
will be called on the AST to reset all the analyzed states before
re-analyze it again. Currently, we consider table masking as the result
of analyzing, so we revert them in FromClause.reset(). This avoids
duplicated masking. However, it may result in a suboptimal plan since no
query rewritters will be applied after the re-analyzing phase. What's
worse, it can't avoid duplicated masking in the query rewriting phase
which will also analyze some rewritten AST. This happens more frequently
in supporting subquery row filters. Since the subquery will be rewritten
into JOIN and being reanalyzed in the rewriter.

This patch refactor the table mask resolving logic to not apply table
mask by default. Only the first analyzing phase requires masking. With
this change, we don't need the complex logic in FromClause.reset() to
revert the masking. Its implementation is reverted to the behavior
before we support column masking.

Tests
 - This is a refactoring so no new tests are added.

Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtNode.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/WithClause.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
13 files changed, 84 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/17184/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
Gerrit-Change-Number: 17184
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has abandoned this change. ( http://gerrit.cloudera.org:8080/17184 )

Change subject: IMPALA-10483(part-1): Refactor table mask resolving
......................................................................


Abandoned

Abandon since we have a better solution: https://gerrit.cloudera.org/c/17199
-- 
To view, visit http://gerrit.cloudera.org:8080/17184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
Gerrit-Change-Number: 17184
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17184 )

Change subject: IMPALA-10483(part-1): Refactor table mask resolving
......................................................................


Patch Set 2: Code-Review+2

(2 comments)

Changes LGTM.

http://gerrit.cloudera.org:8080/#/c/17184/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17184/1//COMMIT_MSG@23
PS1, Line 23: This patch refactor the table mask resolving logic to not apply table
> Yeah, I think it's a trade-off. Before this patch we apply table mask by de
Right. I think the test coverage would be more crucial now (for security reasons) compared to before where it may have  at worst recursed.


http://gerrit.cloudera.org:8080/#/c/17184/2/fe/src/main/java/org/apache/impala/analysis/StmtNode.java
File fe/src/main/java/org/apache/impala/analysis/StmtNode.java:

http://gerrit.cloudera.org:8080/#/c/17184/2/fe/src/main/java/org/apache/impala/analysis/StmtNode.java@41
PS2, Line 41:   public void setDoTableMasking(boolean doTableMasking) {}
Ideally, this should be an abstract method too such that derived classes ensure masking is done wherever appropriate..but I can understand why you would want to keep the changes minimal for this patch.



-- 
To view, visit http://gerrit.cloudera.org:8080/17184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
Gerrit-Change-Number: 17184
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 16 Mar 2021 17:15:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17184 )

Change subject: IMPALA-10483(part-1): Refactor table mask resolving
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8375/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
Gerrit-Change-Number: 17184
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 16 Mar 2021 09:56:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17184 )

Change subject: IMPALA-10483(part-1): Refactor table mask resolving
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17184/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17184/1//COMMIT_MSG@10
PS1, Line 10: based
nit: 'base' instead of 'based' here and few other places.


http://gerrit.cloudera.org:8080/#/c/17184/1//COMMIT_MSG@23
PS1, Line 23: This patch refactor the table mask resolving logic to not apply table
One effect of not applying table mask by default is that now each of the clauses (SELECT, FROM, WHERE etc.) needs to explicitly call setDoTableMasking(). If there's a clause that is missing this implementation or if the programmer did not invoke this method on a clause for some reason, the flag will remain false and we could end up not applying the table mask.  But as long as we have good test coverage on the queries, this may be ok.


http://gerrit.cloudera.org:8080/#/c/17184/1/fe/src/main/java/org/apache/impala/analysis/WithClause.java
File fe/src/main/java/org/apache/impala/analysis/WithClause.java:

http://gerrit.cloudera.org:8080/#/c/17184/1/fe/src/main/java/org/apache/impala/analysis/WithClause.java@66
PS1, Line 66:   public void setDoTableMasking(boolean doTableMasking) {
I would think this would be an @Override of a corresponding method in base class but the base class does not have this method..is that intentional ?



-- 
To view, visit http://gerrit.cloudera.org:8080/17184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
Gerrit-Change-Number: 17184
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Mar 2021 06:21:43 +0000
Gerrit-HasComments: Yes