You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Quanlong Huang (Code Review)" <ge...@cloudera.org> on 2021/01/25 02:35:07 UTC

[Impala-ASF-CR] [WIP] IMPALA-9234: Support Ranger row filtering policies

Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16976


Change subject: [WIP] IMPALA-9234: Support Ranger row filtering policies
......................................................................

[WIP] IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

TODO: refactor some common codes with column masking
TODO: add more tests (including audit tests)

Tests:
 - Add FE test for error message when disabling row filtering
 - Add e2e test with row filtering policies
 - Add e2e test with column masking and row filtering policies both take
   place

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
16 files changed, 209 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/1
-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6982/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 12
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Mar 2021 15:26:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java
File fe/src/main/java/org/apache/impala/authorization/TableMask.java:

http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java@109
PS9, Line 109:     String stmtSql = String.format("SELECT * FROM %s.%s WHERE %s",
> I think you mean IMPALA-9223. It's commented in InlineViewRef.createTableMa
Update: I'm going to fix IMPALA-9661 so don't need IMPALA-9223



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Mar 2021 09:09:29 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 2:

(4 comments)

Thanks Tim's comments! This patch is ready for review now.

Also added Csaba and Fang-Yu as reviewers since you are involved in the work of column masking. Hope you can take a look when you have time. Thanks!

http://gerrit.cloudera.org:8080/#/c/16976/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16976/1//COMMIT_MSG@33
PS1, Line 33: Tests:
> I think we need to add some more tests for different row filter types, e.g.
Yes, UDFs are allowed. Ideally, we should support all kinds of expressions. However, after adding more tests, I found two hard cases and would support them in follow-up patches:

1) Expressions that contains subqueries on other tables. Those tables can't be resolved correctly since table metadata loading is done before analyzing. E.g. for table "functional.alltypesagg" we can create a row-filter policy with expr: "id in (select id from functional.alltypestiny)". It's expected to transform query "select * from functional.alltypesagg" to "select * from functional.alltypesagg where id in (select id from functional.alltypestiny)". The query fails since the analyzer can't resolve "functional.alltypestiny" which is not in the StmtTableCache. Filed IMPALA-10483.

2) Expressions tables containing nested types and the query uses unrelated collection column directly. We need to rewrite the query to apply the table masking view on the base table. Filed IMPALA-10484.


http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/TableMask.java
File fe/src/main/java/org/apache/impala/authorization/TableMask.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/TableMask.java@102
PS1, Line 102:   /**
> We might need to be a bit careful with testing invalid row filters that don
Yeah, but we can only detect syntax errors here. For semantic errors like return type is not BOOLEAN, we need to detect it later. I'll add some tests later.


http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@397
PS1, Line 397:         maskedColumn = maskedValue.replace("{col}", columnName);
> Can we factor out this code and share it with the similar logic above? This
Done. We still need to revisit whether row filtering policies will produce stale events (like column masking policies).


http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java@56
PS1, Line 56:    * Stash and deduplicate the audit events produced by table masking (Column-masking /
> This comment probably needs an update. I find this method a bit confusing o
Done. Refactor these to be more general.

@Fang-Yu Please help to reivew them since you are the author :)



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Feb 2021 12:44:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 10:

(4 comments)

Thank Aman!

http://gerrit.cloudera.org:8080/#/c/16976/9/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/16976/9/be/src/util/backend-gflag-util.cc@158
PS9, Line 158: DEFINE_bool(enable_row_filtering, true,
> Since enabling this flag depends on enabling column masking, it would be us
Done


http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java
File fe/src/main/java/org/apache/impala/authorization/TableMask.java:

http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java@109
PS9, Line 109:     String stmtSql = String.format("SELECT * FROM %s.%s WHERE %s",
> My understanding is the SELECT * will need to be changed to only project co
I think you mean IMPALA-9223. It's commented in InlineViewRef.createTableMaskView() where we actually create the table mask view.

This query is just for parsing the row filter string into AST. Note that only the WHERE clause is returned. So "SELECT *" is used for simplicity.


http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java@111
PS9, Line 111:     SelectStmt selectStmt = (SelectStmt) Parser.parse(stmtSql);
> Since this will throw an AnalysisException (or ParseException) for a malfor
Yes, good question. For example if table functional.alltypestiny has an illegal row filter like "10 id = int_col", query on it will fail as

 [localhost:21050] default> select * from functional.alltypestiny;
 Query: select * from functional.alltypestiny
 Query submitted at: 2021-03-17 14:46:27 (Coordinator: http://quanlong-OptiPlex-BJ:25000)
 ERROR: ParseException: Syntax error in line 1:
 SELECT * FROM functional.alltypestiny WHERE 10 id = int_col
                                                ^
 Encountered: IDENTIFIER
 Expected: AND, BETWEEN, DIV, EXCEPT, GROUP, HAVING, ILIKE, IN, INTERSECT, IREGEXP, IS, LIKE, LIMIT, ||, MINUS, NOT, OFFSET, OR, ORDER, REGEXP, RLIKE, UNION

 CAUSED BY: Exception: Syntax error

I tried it in Hive as well, the failure is similar:

 0: jdbc:hive2://localhost:11050> select * from functional.alltypestiny;
 Error: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.parse.ParseException: line 1:484 missing ) at 'id' near 'id'
 line 1:487 missing EOF at '=' near 'id' (state=42000,code=40000)

We will expose more details than Hive since we have a more friendly error message. But it can help user to correct the error. I'm not sure whether we should swallow the details. I think it's undefined on the error behavior for illegal row filter. Do you think we should file a JIRA for both Hive and Impala?


http://gerrit.cloudera.org:8080/#/c/16976/9/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/9/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@1
PS9, Line 1: ====
> Thanks for adding these tests.  I had a couple of suggestions to the test c
Good questions!

1. The test at line 67 is a related test. Currently we don't have a formatted error message. The raw cause is exposed.
2. Done for WITH clause. Borrowing some tests in ranger_column_masking.test. For the VIEWs, the tests on alltypes_view and masked_view are related. Should I add more?
3. Currently both Hive and Impala will show the real plan containing the masking expressions. So Impala runtime profile will has them as well. I think it's ok to have them, which helps the end user to understand the query results.



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Mar 2021 08:08:25 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8353/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sat, 13 Mar 2021 03:34:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 9:

(4 comments)

A few comments below. Overall this looks good.

http://gerrit.cloudera.org:8080/#/c/16976/9/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/16976/9/be/src/util/backend-gflag-util.cc@158
PS9, Line 158: DEFINE_bool(enable_row_filtering, true,
Since enabling this flag depends on enabling column masking, it would be useful to show something like 'enabling this flag requires enable_column_masking to be true'.


http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java
File fe/src/main/java/org/apache/impala/authorization/TableMask.java:

http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java@109
PS9, Line 109:     String stmtSql = String.format("SELECT * FROM %s.%s WHERE %s",
My understanding is the SELECT * will need to be changed to only project columns referenced in the query. If that's true and there's an existing JIRA for it, you can reference it here.


http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java@111
PS9, Line 111:     SelectStmt selectStmt = (SelectStmt) Parser.parse(stmtSql);
Since this will throw an AnalysisException (or ParseException) for a malformed row filter, would that accidentally show the column name ?


http://gerrit.cloudera.org:8080/#/c/16976/9/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/9/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@1
PS9, Line 1: ====
Thanks for adding these tests.  I had a couple of suggestions to the test coverage and a question about EXPLAIN:
1. A test for AnalyticException/ParseException for a malformed row filter.  What is actually shown in such a error message and do we need to hide anything in the message.  
2. Since WITH clause and VIEWs are expected to also be enforcing the row filter, could you add one test for each ?
3. (this is more of a question) What about EXPLAIN ? Will an EXPLAIN for a query that has table masking applied (either row/column masking or both) show the rewritten query plan ?  What about the runtime query profile ?
Depending on the expected behavior, we should probably add a sanity test.



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Mar 2021 01:49:28 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
File fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@175
PS2, Line 175:     // TODO: Column-masking/Row-filtering expressions may have subqueries which may
Can you file a JIRA for this? I guess it is a fairly plausible use case for this - I found an example here - https://cwiki.apache.org/confluence/display/RANGER/Row-level+filtering+and+column-masking+using+Apache+Ranger+policies+in+Apache+Hive

I think we might also have to be careful about exactly what subqueries are allowed. E.g. if would be weird if you had a correlated references between the row filter subquery and the outer query, or if you could have a relative table reference that referred to different tables depending on which DB it was executed in.

edit: found the JIRA IMPALA-10483


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@473
PS2, Line 473: when row filter is disabled
This is inaccurate - this method doesn't check whether row filtering is enabled.


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@476
PS2, Line 476: authorizeRowFilter
Rename this method to reflect new behaviour, e.g. throwIfRowFilteringRequired()


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java@139
PS2, Line 139:       // TODO: Whether we should use lowercase or uppercase accessType?
How would we decide this? What does Hive do?


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@454
PS2, Line 454:     String databaseName = "functional";
Add a brief comment describing the row filter policies added.


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@502
PS2, Line 502:       authzOk(events -> {
Can you comment why we get the row filter event in one case but not the other (I think it's because of the user it's applied to right).


http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test@3
PS2, Line 3: # Row-filtering policy keeps rows with "id % 3 = 0".
Is the expected behaviour that row filters are applied before column masks?


http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@1
PS2, Line 1: ====
Can you add a test where the row filter references a column not in the table. it should fail with an analysis exception. There's an extra edge case to check here where it isn't in the table, but could be a correlated reference to an outer query.

E.g. This is a valid query, but I think test_id = id would be an invalid row filter on alltypes.

  select * from jointbl
  where exists(select * from alltypes where test_id = id);



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 12 Feb 2021 00:14:16 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 7:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6967/


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sun, 14 Mar 2021 08:02:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 4:

(8 comments)

Thank Csaba's review! Addressed the comments.

http://gerrit.cloudera.org:8080/#/c/16976/3/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/16976/3/be/src/common/global-flags.cc@344
PS3, Line 344:     "increm
> optional: I think that if we only use this in Java, then it is better to pl
Good point. Move this and 'enable_column_masking' to backend-gflag-util.cc. I think lots of FE-only flags are defined here. We need to follow the new rule for future FE-only flags.


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@819
PS3, Line 819: uilder.grantOption();
             :         }
> potential simplification: line 836 uses a very similar condition - can you 
Sure. I did some refactoring to make this resolveTableRef() function shorter.


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@845
PS3, Line 845: Table sho
> IMPALA-10482 uses the word "unrelative". I am not sure which one is better.
Oops... Let's use 'non-relative' which is consistent to an existing comment here: https://github.com/apache/impala/blob/5baadd1da7d554ea3446e2a025afe8e991339765/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java#L498


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java@96
PS3, Line 96: needsRowFilter
> I think that needsRowFiltering would be clearer.
Done


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@421
PS3, Line 421:     LOG.info("dbName: {}, tableName: {}, rowFilter: {}", dbName, tableName, filter);
> Looks like temporary logging.
It's intended. We have a similar log at line 402 for column masking. With them we can know which filter is choosed, which is helpful to understand the query results. Without them we can only explain the query and find the filters in the plan.

Is it ok for you to keep them?


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java@104
PS3, Line 104: supportsTableMasking
> I don't understand why was the name changed - it returns isColumnMaskingEna
Oops, we should also check row filtering as well.


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@527
PS3, Line 527:           "        }\n" +
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@527
PS3, Line 527:           "        }\n" +
> line too long (92 > 90)
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Mar 2021 03:41:51 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8364/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 15 Mar 2021 01:29:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 10:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8382/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Mar 2021 08:28:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Fang-Yu Rao, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16976

to look at the new patch set (#4).

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test with column masking and row filtering policies both take
   place.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
22 files changed, 855 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/4
-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Fang-Yu Rao (Code Review)" <ge...@cloudera.org>.
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 3:

(9 comments)

Thanks Quanlong for adding the support for row-filtering in Impala! Sorry that it took me quite a while to review this patch. I briefly reviewed the added end-to-end tests and they look good to me. I only have some very minor comments.

I plan to read/study the parts related to audit logs generation and will try to get back to you soon the coming week.

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java@56
PS1, Line 56:    * Stash and deduplicate the audit events produced by table masking (Column-masking /
> Thanks Quanlong! I will review this and try to get back to you later today.
Thanks Quanlong! Maybe we could also add here a statement like "Refer to IMPALA-9597 for further details" so that people interested in this method could refer to there. I do not have any other suggestion. :-)


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java@56
PS3, Line 56:   * Stash and deduplicate the audit events produced by table masking (Column-masking /
            :    * Row-filtering) which are performed during the analyze phase. Called at the end of
            :    * analyzing. These stashed events will be added back after the query pass the
            :    * authorization phase. Note that normal events (select, insert, drop, etc.) are
            :    * produced in the authorization phase. Stashing table masking events avoids exposing
            :    * them when the query fails authorization.
Thanks Quanlong!

Maybe we could also add here a statement like "Refer to IMPALA-9597 for further details" so that people interested in this method could refer to there as well. I do not have any other suggestion. :-)


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test@16
PS3, Line 16: # Column-masking policies of functional.alltypes mask "id" to "-id" and redact column
            : # "date_string_col". Row-filtering policy of functional.alltypes_view keeps rows with
            : # "id >= -8 and date_string_col = 'nn/nn/nn'"
Thanks Quanlong for adding this test case!

Maybe it would be even clearer to emphasize here that the view 'alltypes_view' is based on the table 'alltypes' and thus the column masking policies were applied to 'alltypes' before the row-filter policies are applied to 'alltypes_view'.


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@76
PS3, Line 76: support using subquery
Maybe it would be clearer to emphasize that the subquery is on the same table, i.e., 'functional', in order to contrast with the next test case.


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@129
PS3, Line 129: select * from functional_parquet.complextypestbl.nested_struct.b
Maybe it would be good to briefly mention here that what the query is to fetch the desired result set if there is no row-filtering policy involved.


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@136
PS3, Line 136: # The above query should be manually rewritten to this until IMPALA-10484 is resolved.
Thanks for adding this test case!


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@146
PS3, Line 146: ====
I was wondering if it would be good to add a test case involving 2 tables each associated with a row-filtering policy. For example, 'where id % 2 = 0' for the table 'functional.alltypestiny' and 'where id % 3 = 0' for the table 'functional.alltypessmall'


http://gerrit.cloudera.org:8080/#/c/16976/3/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/16976/3/tests/authorization/test_ranger.py@1095
PS3, Line 1095:     user = getuser()
It may be good to add a comment here to point out that we do not need to additionally grant the SELECT privileges to 'getuser()' because 'getuser()' is the owner of the databases used in the test queries.

This test also shows that the a row-filtering policy takes precedence over the ownership of the resources involved in the policy. A user cannot access some rows in a table if there is some row-filtering policy applied to the table even though this user is the owner of the table. Is my understanding correct?


http://gerrit.cloudera.org:8080/#/c/16976/3/tests/authorization/test_ranger.py@1104
PS3, Line 1104: concat('0', '')
I am not very familiar with the creation of a row-filtering policy via REST API call. Is it true that simply using "string_col = '0'" does not work so that we have to use "concat('0', '')"?



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Mar 2021 00:59:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Fang-Yu Rao, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16976

to look at the new patch set (#7).

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default. Enabling row-filtering requires
--enable_column_masking=true since it depends on the column masking
implementation.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test with column masking and row filtering policies both take
   place.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/AuthorizationUtil.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
23 files changed, 935 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/7
-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] [WIP] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: [WIP] IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8017/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Jan 2021 02:56:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Fang-Yu Rao (Code Review)" <ge...@cloudera.org>.
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java@56
PS1, Line 56:    * Stash and deduplicate the audit events produced by table masking (Column-masking /
> Done. Refactor these to be more general.
Thanks Quanlong! I will review this and try to get back to you later today.



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Feb 2021 17:14:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java
File fe/src/main/java/org/apache/impala/authorization/TableMask.java:

http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java@109
PS9, Line 109:     String stmtSql = String.format("SELECT * FROM %s.%s WHERE %s",
> Note that only the WHERE clause is returned. So "SELECT *" is used for simplicity.

That's a good point...but in that case could we just say SELECT 1  ?  Not a big deal. 
Right, I was thinking about IMPALA-9223 in my comment above.


http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java@111
PS9, Line 111:     SelectStmt selectStmt = (SelectStmt) Parser.parse(stmtSql);
> Yes, good question. For example if table functional.alltypestiny has an ill
I would think that the admin user who has access to the row filter definition should see the parse exception details such that he can fix it but for the normal user there should be some level of obscuring of the error message ..otherwise it would reveal the filter details and in any case the normal user won't be able to fix it by himself.
BTW, in the above execution I assume you were the privileged user.  But it sounds like the behavior would have been the same if you were not. 
I am ok if you want to create an enhancement JIRA for this and see what makes sense for Impala. It is probably not a blocker for the feature.  I think the Hive message is also problematic ..it just happened to show less for this case but what if the row filter subquery had some syntax issue ?

EDIT: I see later that you mention both Impala and Hive will show the full rewritten query in the EXPLAIN plan and the query profile.  This contains the expanded row filter even for the normal user .. so I am not sure if the parse exception is that relevant.  I would be curious to know how other products like Spark deal with such things.  The row filter could have things like Salary or Social Security number predicate etc which should ideally be hidden.



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Mar 2021 16:01:21 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6963/


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sat, 13 Mar 2021 09:35:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 11: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Mar 2021 15:25:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Aman Sinha, Fang-Yu Rao, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16976

to look at the new patch set (#10).

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default. Enabling row-filtering requires
--enable_column_masking=true since it depends on the column masking
implementation.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test with column masking and row filtering policies both take
   place.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/AuthorizationUtil.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
23 files changed, 1,005 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/10
-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8263/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Mar 2021 06:11:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Fang-Yu Rao, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16976

to look at the new patch set (#5).

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test with column masking and row filtering policies both take
   place.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
22 files changed, 928 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/5
-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16976

to look at the new patch set (#2).

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test with column masking and row filtering policies both take
   place.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
22 files changed, 670 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/2
-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6961/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 12 Mar 2021 13:57:47 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Aman Sinha, Fang-Yu Rao, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16976

to look at the new patch set (#11).

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default. Enabling row-filtering requires
--enable_column_masking=true since it depends on the column masking
implementation.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test with column masking and row filtering policies both take
   place.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/AuthorizationUtil.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
23 files changed, 1,005 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/11
-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 3:

(6 comments)

I went through the implementation but didn't look too much at the tests. Will do another pass soon.

http://gerrit.cloudera.org:8080/#/c/16976/3/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/16976/3/be/src/common/global-flags.cc@344
PS3, Line 344: DEFINE_bool
optional: I think that if we only use this in Java, then it is better to place the define in backend-gflag-util.cc

The reason why I did this for SAML related configs is that #include "kudu/util/flag_tags.h" caused link problems when it was included in global-flags.cc. Even if we don't use tagging, it makes more sense to me to keep only really global flags in this global-flags.cc


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@819
PS3, Line 819: doTableMasking || !getAuthzFactory().getAuthorizationConfig().isEnabled()
             :           || !getAuthzFactory().supportsTableMasking()
potential simplification: line 836 uses a very similar condition - can you move it to a common bool variable? e.g. doTableMasking = doTableMasking &&  getAuthzFactory().getAuthorizationConfig().isEnabled() && getAuthzFactory().supportsTableMasking()


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@845
PS3, Line 845: unrelated
IMPALA-10482 uses the word "unrelative". I am not sure which one is better.


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java@96
PS3, Line 96: needsFiltering
I think that needsRowFiltering would be clearer.


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@421
PS3, Line 421:     LOG.info("dbName: {}, tableName: {}, rowFilter: {}", dbName, tableName, filter);
Looks like temporary logging.


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java@104
PS3, Line 104: supportsTableMasking
I don't understand why was the name changed - it returns isColumnMaskingEnabled(), which is specific to column masking



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Feb 2021 14:41:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6969/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 15 Mar 2021 01:09:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java
File fe/src/main/java/org/apache/impala/authorization/TableMask.java:

http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java@109
PS9, Line 109:     // Parse the row filter string to AST by using it in a fake query.
> > Note that only the WHERE clause is returned. So "SELECT *" is used for si
Done. I think "SELECT 1" is the same complexity since we just do parsing (not analyzing) here. But maybe "SELECT 1" can avoid the above confusion.


http://gerrit.cloudera.org:8080/#/c/16976/9/fe/src/main/java/org/apache/impala/authorization/TableMask.java@111
PS9, Line 111:     SelectStmt selectStmt = (SelectStmt) Parser.parse(stmtSql);
> I would think that the admin user who has access to the row filter definition should see the parse exception details such that he can fix it but for the normal user there should be some level of obscuring of the error message ..otherwise it would reveal the filter details and in any case the normal user won't be able to fix it by himself.

No, the admin user can't see the error if the policy doesn't apply to him/her. Usually the policy is set for normal users. I think the process of adding a row filter should be:
1) Admin user adds a new table containing sensitive data.
2) Admin user adds a row-filtering policy for it in Ranger Admin WebUI. Note that Ranger treates the row filter as a string and won't validate it. The admin user should also choose what roles/groups/users the policy applies to.
3) Admin user switchs to target users and run some queries on the table. Verify that the row filter works as expected. At this point, a detailed error exposing the row filter helps to debug the malformed row filter string.
4) Admin user publishs the table (notify the target users about this new table).
5) If the normal users encounter any errors, they should reach out to the admin user for help. They usually don't have permission to edit the row filter policy.

> BTW, in the above execution I assume you were the privileged user.  But it sounds like the behavior would have been the same if you were not.

Good point! I think this is a problem that we resolve the column/row masking before privilege checks. I'll file a follow-up JIRA for this.

> I think the Hive message is also problematic ..it just happened to show less for this case but what if the row filter subquery had some syntax issue ?

Here's another example for syntax issue. Adding a row filter "id = (select from functional.alltypessmall)" on table functional.alltypestiny for my username, then launch beeline

 0: jdbc:hive2://localhost:11050> select * from functional.alltypestiny;
 Error: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.parse.ParseException: line 1:487 cannot recognize input near '(' 'select' 'from' in expression specification (state=42000,code=40000)

You can play around with Hive by launching it using "testdata/bin/run-hive-server.sh -with_ranger".

> EDIT: I see later that you mention both Impala and Hive will show the full rewritten query in the EXPLAIN plan and the query profile.  This contains the expanded row filter even for the normal user .. so I am not sure if the parse exception is that relevant.  I would be curious to know how other products like Spark deal with such things.  The row filter could have things like Salary or Social Security number predicate etc which should ideally be hidden.

Yeah, I will do some investigations.



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Mar 2021 00:16:03 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6961/


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 12 Mar 2021 19:33:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16976/4/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/16976/4/tests/authorization/test_ranger.py@1124
PS4, Line 1124: #
flake8: E266 too many leading '#' for block comment


http://gerrit.cloudera.org:8080/#/c/16976/4/tests/authorization/test_ranger.py@1169
PS4, Line 1169: #
flake8: E266 too many leading '#' for block comment



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Mar 2021 03:40:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 7:

> Patch Set 6: Verified-1
> 
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6961/

Updated the patch to not allow enabling row-filtering but disabling column masking.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sat, 13 Mar 2021 03:15:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 12: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 12
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Mar 2021 21:08:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6967/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sun, 14 Mar 2021 01:45:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Fang-Yu Rao, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16976

to look at the new patch set (#3).

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test with column masking and row filtering policies both take
   place.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
22 files changed, 710 insertions(+), 66 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/3
-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 8: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 15 Mar 2021 07:00:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8089/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Feb 2021 13:00:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default. Enabling row-filtering requires
--enable_column_masking=true since it depends on the column masking
implementation.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test with column masking and row filtering policies both take
   place.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
Reviewed-on: http://gerrit.cloudera.org:8080/16976
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/AuthorizationUtil.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
23 files changed, 1,005 insertions(+), 113 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 4:

> Patch Set 3:
> 
> (9 comments)
> 
> Thanks Quanlong for adding the support for row-filtering in Impala! Sorry that it took me quite a while to review this patch. I briefly reviewed the added end-to-end tests and they look good to me. I only have some very minor comments.
> 
> I plan to read/study the parts related to audit logs generation and will try to get back to you soon the coming week.

Oops, just realized I missed this when uploading PS4. Will address your comments in the next patch. Thank Fang-Yu!


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Mar 2021 03:42:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 11: Code-Review+2

I feel comfortable with the current patch to move ahead.  There's good discussion here and possible follow-up enhancements but it makes sense to get this core patch in.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Mar 2021 00:24:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] [WIP] IMPALA-9234: Support Ranger row filtering policies

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: [WIP] IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16976/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16976/1//COMMIT_MSG@33
PS1, Line 33:    place
I think we need to add some more tests for different row filter types, e.g. more complex expressions, invalid expressions, etc.

Are there any restrictions on the types of expressions that can be used in row filters? E.g. are UDFs allowed?


http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/TableMask.java
File fe/src/main/java/org/apache/impala/authorization/TableMask.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/TableMask.java@102
PS1, Line 102:     String stmtSql = String.format("SELECT * FROM %s.%s WHERE %s",
We might need to be a bit careful with testing invalid row filters that don't parse, etc. It seems like this approach should work, but just trying to think about the risks.


http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@397
PS1, Line 397:     List<AuthzAuditEvent> auditEvents = auditHandler.getAuthzEvents();
Can we factor out this code and share it with the similar logic above? This code seems subtle and it would be good to only have one implementation.


http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java@56
PS1, Line 56:    * We stash the List of AuthzAuditEvent's in a Map after the analysis of the query and
This comment probably needs an update. I find this method a bit confusing overall. Maybe it would be clearer if the comment described what it means to call this method? Or when it would be called? The comment seems to mainly discuss the implementation.



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 27 Jan 2021 20:42:27 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@527
PS3, Line 527:         assertEventEquals("@column", "select", "functional/alltypestiny/date_string_col", 1,
line too long (92 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 19 Feb 2021 09:39:43 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 7:

> Patch Set 7: Verified-1
> 
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6963/

Failed by a falky test: IMPALA-10559


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sun, 14 Mar 2021 01:45:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 12: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 12
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Mar 2021 15:26:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 7:

> Patch Set 7:
> 
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6967/

Hit IMPALA-10559 again...


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sun, 14 Mar 2021 12:22:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 11:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8386/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Mar 2021 00:35:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@524
PS2, Line 524:         assertEventEquals("@column", "select", "functional/alltypestiny/date_string_col", 1,
line too long (92 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Feb 2021 12:42:35 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6963/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sat, 13 Mar 2021 03:17:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
File fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@175
PS2, Line 175:     // TODO(IMPALA-10483): Column-masking/Row-filtering expressions may have subqueries
Done. Added IMPALA-10483 in the comment.

> I think we might also have to be careful about exactly what subqueries are allowed. E.g. if would be weird if you had a correlated references between the row filter subquery and the outer query, or if you could have a relative table reference that referred to different tables depending on which DB it was executed in.

Yeah, currently we depend on users to verify the rules. I think Hive does so. Copy this to comments of IMPALA-10483 so we can take care of it once we support subquery filters.


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@473
PS2, Line 473: given tables. This is used 
> This is inaccurate - this method doesn't check whether row filtering is ena
Done


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@476
PS2, Line 476: throwIfRowFilterin
> Rename this method to reflect new behaviour, e.g. throwIfRowFilteringRequir
Done. Also rename the authorizeColumnMask() method name.


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java@139
PS2, Line 139:       // TODO: Whether we should use lowercase or uppercase accessType?
> How would we decide this? What does Hive do?
AccessTypes (e.g. SELECT, INSERT, MASK, ROW_FILTER) in Hive audit events are all in uppercase. They inherit from some constant strings witch are uppercase, e.g. result.getMaskType().

However, AccessTypes in Impala audit events are always converted to lowercase. I think it's not neccessary. But not sure if changing them to uppercase will break other tools, which should be considered as a breaking change.

Note that the AccessType field is only used for observerbility.


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@454
PS2, Line 454:     // Two row filter policies will be added. The first one affects 'user_' and keeps rows
> Add a brief comment describing the row filter policies added.
Done


http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test@3
PS2, Line 3: # Row-filtering policy keeps rows with "id % 3 = 0".
> Is the expected behaviour that row filters are applied before column masks?
Yes, column masking applies on result data. So the row filtering happens first. I mention this in the commit message. Also copied it here for readability.


http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@1
PS2, Line 1: ====
> Can you add a test where the row filter references a column not in the tabl
Good point! But this corner case won't happen, because 'alltypes' will be replaced by an inline view and the filter is added inside the view. The unresolved column won't be resolved as correlated reference.

To be specifit, let's say table 'alltypes' has a row filter policy of "test_id = id". Table 'jointbl' has a column 'test_id' which doesn't exist in 'alltypes'. The query

  select * from jointbl
  where exists(select * from alltypes)

will be analyzed as

  select * from jointbl
  where exists(select * from
      (select id, bool_col, ..., month from alltypes where test_id = id) v
  )

and will hit an error in resolving 'test_id'.

Added a similar test to cover this.



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 19 Feb 2021 09:39:13 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 4:

(11 comments)

Addressed Fang-Yu's comment. PS4 and PS5 both add more tests.

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java@56
PS1, Line 56:    * Stash and deduplicate the audit events produced by table masking (Column-masking /
> Thanks Quanlong! Maybe we could also add here a statement like "Refer to IM
Done


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java@56
PS3, Line 56:   * Stash and deduplicate the audit events produced by table masking (Column-masking /
            :    * Row-filtering) which are performed during the analyze phase. Called at the end of
            :    * analyzing. These stashed events will be added back after the query pass the
            :    * authorization phase. Note that normal events (select, insert, drop, etc.) are
            :    * produced in the authorization phase. Stashing table masking events avoids exposing
            :    * them when the query fails authorization.
> Thanks Quanlong!
Done


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test@16
PS3, Line 16: # Column-masking policies of functional.alltypes mask "id" to "-id" and redact column
            : # "date_string_col". Row-filtering policy of functional.alltypes_view keeps rows with
            : # "id >= -8 and date_string_col = 'nn/nn/nn'"
> Thanks Quanlong for adding this test case!
Done


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@76
PS3, Line 76: 
> Maybe it would be clearer to emphasize that the subquery is on the same tab
Done


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@129
PS3, Line 129: ---- RESULTS
> Maybe it would be good to briefly mention here that what the query is to fe
Done


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@136
PS3, Line 136: BIGINT,INT,INT
> Thanks for adding this test case!
Done


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@146
PS3, Line 146: 8,-1
> I was wondering if it would be good to add a test case involving 2 tables e
Sure, done. I reused the existing filters and added the tests at line 23 in PS5.


http://gerrit.cloudera.org:8080/#/c/16976/3/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/16976/3/tests/authorization/test_ranger.py@1095
PS3, Line 1095:           policy_names.append(policy_name)
> It may be good to add a comment here to point out that we do not need to additionally grant the SELECT privileges to 'getuser()' because 'getuser()' is the owner of the databases used in the test queries.

Done

> A user cannot access some rows in a table if there is some row-filtering policy applied to the table even though this user is the owner of the table. Is my understanding correct?

Yeah, this is the behavior by designed.


http://gerrit.cloudera.org:8080/#/c/16976/3/tests/authorization/test_ranger.py@1104
PS3, Line 1104: e_policy(policy
> I am not very familiar with the creation of a row-filtering policy via REST
No, in PS1 I use string = '0' and it's ok. This is just for making sure function calls can be used in the filter. I'll add a comment for it.


http://gerrit.cloudera.org:8080/#/c/16976/4/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/16976/4/tests/authorization/test_ranger.py@1124
PS4, Line 1124: #
> flake8: E266 too many leading '#' for block comment
Done


http://gerrit.cloudera.org:8080/#/c/16976/4/tests/authorization/test_ranger.py@1169
PS4, Line 1169: #
> flake8: E266 too many leading '#' for block comment
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Mar 2021 05:52:28 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8169/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 19 Feb 2021 10:08:07 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Fang-Yu Rao, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16976

to look at the new patch set (#8).

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to be true so
the feature is enabled by default. Enabling row-filtering requires
--enable_column_masking=true since it depends on the column masking
implementation.

Note that row filtering policies take effects prior to any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test with column masking and row filtering policies both take
   place.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/AuthorizationUtil.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
23 files changed, 935 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/8
-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8262/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Mar 2021 03:58:35 +0000
Gerrit-HasComments: No