Posted to dev@ranger.apache.org by "Xuze Yang (Jira)" <ji...@apache.org> on 2022/09/24 14:48:00 UTC

[jira] [Commented] (RANGER-3685) hive 'show' sql produces excessive audit log

    [ https://issues.apache.org/jira/browse/RANGER-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609007#comment-17609007 ] 

Xuze Yang commented on RANGER-3685:
-----------------------------------

I have uploaded a patch that adds a configuration item, "xasecure.hive.simplify.audit.of.hive.show.sql", which controls whether the audit log for Hive 'show' statements is simplified. If the item is not specified, it defaults to false, meaning the Hive audit log is not simplified, which preserves the previous behavior.
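
For readers who want to see the default-false behavior concretely, here is a minimal sketch of how such a flag could be read. It is not the code from the attached patch: the class name, the method name, and the use of java.util.Properties are assumptions made for illustration only. The only item taken from this comment is the property name.

```java
import java.util.Properties;

// Hypothetical sketch; the attached patch may read the flag differently.
final class ShowAuditConfigSketch {
    // Property name as quoted in this comment.
    static final String SIMPLIFY_KEY = "xasecure.hive.simplify.audit.of.hive.show.sql";

    // Returns false when the property is absent, matching the stated default,
    // so an unconfigured deployment keeps the previous audit behavior.
    static boolean simplifyShowAudit(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(SIMPLIFY_KEY, "false"));
    }
}
```

With an empty Properties object the method returns false; it returns true only when the property is explicitly set to "true".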

Please find the attached patch and review the PR at your convenience. Thanks.

CC: [~madhan] [~kirbyzhou] [~kulkabhay] 

> hive 'show' sql produces excessive audit log
> --------------------------------------------
>
>                 Key: RANGER-3685
>                 URL: https://issues.apache.org/jira/browse/RANGER-3685
>             Project: Ranger
>          Issue Type: Improvement
>          Components: audit
>    Affects Versions: 2.1.0
>            Reporter: Xuze Yang
>            Priority: Major
>         Attachments: 0001-1.-hive-show-sql.patch, submit patch.pdf
>
>
> Since Ranger 2.1.0, a user needs any permission on a database to be authorized for "show databases". RangerHiveAuthorizer.filterListCmdObjects() is implemented to filter out the databases that the user does not have access to.
> This is a good implementation, but it comes with a problem: the method records an audit log entry for each database (or each table, for "show tables"). In our production environment, there are 80,000 tables under one Hive database. A single "show tables" operation generates 80,001 audit log entries (the extra one is the verification of the USE permission). Unfortunately, our customers call "show tables" frequently.
> This brings up two problems: 
>  # Valuable audit logs are buried in the flood
>  # A lot of storage is consumed
> For problem 2, the following scenario has occurred in our environment: our audit log destination went down, so all audit logs were spooled to disk files. Several days later, the size of the disk files exceeded 800G, causing other components to fail to provide services normally.
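
To make the 80,001 figure in the description concrete, the following is a small, self-contained simulation of per-object auditing during a list filter. It is only an illustration of the counting, not Ranger code: the class, the method, and the audit entry strings are all hypothetical, and the real filterListCmdObjects() has a different signature.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of the behavior described in the issue.
final class ShowTablesAuditCount {
    // Filters a table listing and appends one audit entry per table checked,
    // plus one entry for the USE permission check on the database.
    static List<String> filterWithAudit(List<String> tables,
                                        Predicate<String> allowed,
                                        List<String> auditLog) {
        auditLog.add("USE check on database");
        List<String> visible = new ArrayList<>();
        for (String table : tables) {
            boolean ok = allowed.test(table);
            auditLog.add("table=" + table + " allowed=" + ok); // one entry per table
            if (ok) {
                visible.add(table);
            }
        }
        return visible;
    }
}
```

Calling this with a list of 80,000 table names leaves 80,001 entries in auditLog for a single "show tables", regardless of how many tables the user is actually allowed to see.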



--
This message was sent by Atlassian Jira
(v8.20.10#820010)