Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/03/17 09:12:00 UTC

[jira] [Commented] (IMPALA-9661) Avoid introducing unused columns in table masking view

    [ https://issues.apache.org/jira/browse/IMPALA-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303207#comment-17303207 ] 

Quanlong Huang commented on IMPALA-9661:
----------------------------------------

One solution is to add a statement rewrite rule that removes the redundant columns introduced by the table masking view. We can register the columns that are actually referenced after analyze(), then remove the unreferenced ones in the rewriter.
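
To make the idea concrete, here is a minimal sketch in plain Java. It is not Impala's actual frontend code: the class MaskViewPruner, its methods, and the map-based representation of the view's select list are all hypothetical and only illustrate the "register referenced columns, then prune" step.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Impala.
final class MaskViewPruner {
  private MaskViewPruner() {}

  /**
   * Keeps only the masking-view columns that the enclosing query block references.
   *
   * @param viewColumns output column name -> select-list expression of the masking
   *                    view, in select-list order (e.g. "name" -> "mask(name)")
   * @param referenced  column names registered while analyzing the query block
   */
  static Map<String, String> prune(Map<String, String> viewColumns,
                                   Set<String> referenced) {
    Map<String, String> kept = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : viewColumns.entrySet()) {
      if (referenced.contains(e.getKey())) {
        kept.put(e.getKey(), e.getValue());
      }
    }
    return kept;
  }

  /**
   * Renders the view's select statement. Unmasked columns are emitted bare and
   * masked ones as "expr as col". A real rule would also have to handle the case
   * where no column is referenced at all; this sketch does not.
   */
  static String toSql(String table, Map<String, String> columns) {
    List<String> items = new ArrayList<>();
    for (Map.Entry<String, String> e : columns.entrySet()) {
      String col = e.getKey();
      String expr = e.getValue();
      items.add(expr.equals(col) ? col : expr + " as " + col);
    }
    return "select " + String.join(", ", items) + " from " + table;
  }
}
```

For the example query in the description, the referenced columns would be name and id, so pruning the view's three columns and rendering the result gives "select id, mask(name) as name from tbl". The masked address column, and with it the extra SELECT privilege requirement, is gone.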

> Avoid introducing unused columns in table masking view
> ------------------------------------------------------
>
>                 Key: IMPALA-9661
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9661
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> If a table has column masking policies, we replace its unanalyzed TableRef with an analyzed InlineViewRef (the table masking view) in FromClause.analyze(). However, we can't detect at this point which columns the original query actually uses, because analyze() for SelectList, WhereClause, GroupByClause and the other clauses containing SlotRefs runs after FromClause.analyze(). Only after the whole query block is analyzed can we get the exact set of required columns. We should do table masking there to avoid introducing unused columns.
> To be specific, if table _tbl_(_id_ int, _name_ string, _address_ string) has column masking policies that mask columns _name_ and _address_, the following query
> {code:sql}
> select name from tbl where id > 10;
> {code}
> will be rewritten to
> {code:sql}
> select name from (
>   select id, mask(name) as name, mask(address) as address from tbl
> ) tbl where id > 10;
> {code}
> The rewritten query introduces a requirement for the SELECT privilege on the _address_ column, which the original query does not need. We should fix either this or IMPALA-9223.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)