Posted to dev@ranger.apache.org by "Barbara Eckman (Jira)" <ji...@apache.org> on 2022/02/07 18:26:00 UTC

[jira] [Commented] (RANGER-3525) Clarify handling of column masks on nested types

    [ https://issues.apache.org/jira/browse/RANGER-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488323#comment-17488323 ] 

Barbara Eckman commented on RANGER-3525:
----------------------------------------

When creating masking policies for Hive tables, only nullify can be used if the column is of type struct, since the other masking approaches assume that the column type is string or numeric. In tag-based policies, where the type of the column is not known until run time, this means in effect that nullify is the only option that can be used for masking policies on any Hive object. While it's good that one of the masking methods works on structs, nullify is typically not users' first choice: null is a valid value for many columns, so it's not clear whether or not a given value has been masked. In addition, joins on the masked columns no longer work once the values are null.

So I agree with the reporter of this ticket that we need a resolution.
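
To illustrate the two drawbacks of nullify described above (masked values cannot be told apart from genuine nulls, and joins on the masked column stop matching), here is a minimal Python sketch. It is not Ranger or Hive code: the function names, the sample data, and the use of a SHA-256 hash as the alternative mask are assumptions made purely for illustration.

```python
import hashlib

def mask_nullify(value):
    """Nullify-style mask (illustrative): every value becomes None, i.e. SQL NULL."""
    return None

def mask_hash(value):
    """Hash-style mask (illustrative): equal inputs give equal outputs; NULL stays NULL."""
    if value is None:
        return None
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()

def inner_join(left, right):
    """Equi-join with SQL semantics: NULL never matches anything, not even NULL."""
    return [(l, r) for l in left for r in right
            if l is not None and r is not None and l == r]

# Hypothetical join keys from two tables; one key is genuinely NULL.
orders = ["alice", "bob", None]
customers = ["alice", "bob", "carol"]

nullified_orders = [mask_nullify(v) for v in orders]
nullified_customers = [mask_nullify(v) for v in customers]
hashed_orders = [mask_hash(v) for v in orders]
hashed_customers = [mask_hash(v) for v in customers]

# After nullify, the genuinely NULL key looks the same as the masked ones.
print(nullified_orders)

# The join on nullified keys finds no matches; the join on hashed keys still does.
nullified_join = inner_join(nullified_orders, nullified_customers)
hashed_join = inner_join(hashed_orders, hashed_customers)
print(len(nullified_join), len(hashed_join))
```

A deterministic mask such as the hash above keeps equality between masked values, which is why it still supports joins, but like the existing approaches it assumes a scalar input and so does not by itself solve the struct case.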

> Clarify handling of column masks on nested types
> ------------------------------------------------
>
>                 Key: RANGER-3525
>                 URL: https://issues.apache.org/jira/browse/RANGER-3525
>             Project: Ranger
>          Issue Type: Task
>          Components: Ranger
>            Reporter: Csaba Ringhofer
>            Priority: Major
>
> Apache Hive and Impala support nested types (aka complex types), for example array<int>, map<int,string> or struct<a:int, b:int>. The inner types of these are accessed with column names containing "." characters, e.g. outer_struct.inner_struct.scalar_member.
> It is not clear what we should do when a column mask is found on a nested type, e.g. struct_col.member or array_col.item. Ranger allows adding policies on column names with "." in them, but both Apache Hive and Apache Impala ignore these policies - I do not know about other engines.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)