Posted to commits@sentry.apache.org by "Colin Ma (JIRA)" <ji...@apache.org> on 2015/04/24 09:39:38 UTC

[jira] [Updated] (SENTRY-565) Improvement the performance when Sentry filter the entity

     [ https://issues.apache.org/jira/browse/SENTRY-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Ma updated SENTRY-565:
----------------------------
    Attachment: SENTRY-565.002.patch

The patch has been rebased. The following are the performance test results for HiveMetaStoreClient.getTables(dbName, "*"):
||number of tables||execution time (with patch)||execution time (without patch)||
|1000|85ms|8463ms|
|5000|214ms|42221ms|
|10000|332ms|79462ms|
The test is based on the project's e2e tests. If Hive and Sentry are installed on different servers, I expect the execution times to be longer than the results above.
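
As a rough reading of the table above, the unpatched time grows about linearly with the number of tables (roughly 8 ms per table), which is consistent with one remote call per table. The following sketch only redoes that arithmetic; the class and method names are made up for illustration and are not part of the patch.

```java
// Hypothetical helper, only for reading the numbers in the table above.
class FilterTimings {
    // Average time spent per table, in milliseconds.
    static double perTableMs(double totalMs, int numTables) {
        return totalMs / numTables;
    }

    // How many times faster the patched run is than the unpatched run.
    static double speedup(double withoutPatchMs, double withPatchMs) {
        return withoutPatchMs / withPatchMs;
    }

    public static void main(String[] args) {
        int[] tables = {1000, 5000, 10000};
        double[] withPatch = {85, 214, 332};           // ms, from the table above
        double[] withoutPatch = {8463, 42221, 79462};  // ms, from the table above
        for (int i = 0; i < tables.length; i++) {
            System.out.printf("%d tables: %.2f ms/table without patch, speedup %.1fx%n",
                    tables[i],
                    perTableMs(withoutPatch[i], tables[i]),
                    speedup(withoutPatch[i], withPatch[i]));
        }
    }
}
```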

> Improvement the performance when Sentry filter the entity
> ---------------------------------------------------------
>
>                 Key: SENTRY-565
>                 URL: https://issues.apache.org/jira/browse/SENTRY-565
>             Project: Sentry
>          Issue Type: Improvement
>            Reporter: Colin Ma
>            Assignee: Colin Ma
>         Attachments: SENTRY-565.001.patch, SENTRY-565.002.patch
>
>
> Currently, when metadata is retrieved from Hive, e.g. with "show tables" or "show databases", Sentry filters the result and outputs only the authorized entities. This filtering makes many RPC calls. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> {code}
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> {code}
> hiveAuthzBinding.authorize gets the privileges from the Sentry service, so if Hive contains many tables, the filtering process takes a long time. Since Sentry also needs to filter columns, HiveAuthzBinding should be improved to reduce the number of RPC calls made when filtering.
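
One way to picture the improvement described above is to fetch the privileges once and then filter the whole result locally, instead of making a remote call per table. The sketch below is only an illustration under that assumption: `PrivilegeSource`, `CountingSource`, and `filterTables` are hypothetical names, not Sentry or Hive APIs, and real Sentry privilege matching (hierarchies, actions, wildcards) is more involved than a set lookup.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types exist in Sentry.
class BatchedTableFilter {

    /** Stand-in for the remote privilege lookup (assumed to be one RPC per call). */
    interface PrivilegeSource {
        Set<String> authorizedTables(String subject, String dbName);
    }

    /** In-memory source that counts calls, to show how many "RPCs" were made. */
    static class CountingSource implements PrivilegeSource {
        private final Set<String> allowed;
        int calls = 0;

        CountingSource(Set<String> allowed) {
            this.allowed = allowed;
        }

        @Override
        public Set<String> authorizedTables(String subject, String dbName) {
            calls++;
            return new HashSet<>(allowed);
        }
    }

    /** Filters queryResult with a single privilege lookup instead of one per table. */
    static List<String> filterTables(PrivilegeSource source, String subject,
                                     String dbName, List<String> queryResult) {
        // One lookup up front, outside the loop.
        Set<String> allowed = source.authorizedTables(subject, dbName);
        List<String> filtered = new ArrayList<>();
        for (String tableName : queryResult) {
            // Local check only; no remote call inside the loop.
            if (allowed.contains(tableName)) {
                filtered.add(tableName);
            }
        }
        return filtered;
    }
}
```

With this shape, the number of lookups stays at one no matter how many tables are in the query result.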


