You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/02/10 21:37:10 UTC

[GitHub] [hive] saihemanth-cloudera opened a new pull request #1970: HIVE-24769: Included owner information included in Table object in Hi…

saihemanth-cloudera opened a new pull request #1970:
URL: https://github.com/apache/hive/pull/1970


   …veMetaStoreClient#getTables()
   
   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://cwiki.apache.org/confluence/display/Hive/HowToContribute
     2. Ensure that you have created an issue on the Hive project JIRA: https://issues.apache.org/jira/projects/HIVE/summary
     3. Ensure you have added or run the appropriate tests for your PR: 
     4. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP]HIVE-XXXXX:  Your PR title ...'.
     5. Be sure to keep the PR description updated to reflect all changes.
     6. Please write your PR title to summarize what this PR proposes.
     7. If possible, provide a concise example to reproduce the issue for a faster review.
   
   -->
   
   ### What changes were proposed in this pull request?
   Table object has table owner information so that authorization can be done on these objects.
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   Otherwise, the "show tables" command in beeline shows all the tables in db instead of the user-owned tables. By having this info, we can filter out tables that are not owned by the user.
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description, screenshot and/or a reproducable example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Hive versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   Local Machine, Remote cluster.
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   -->
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] saihemanth-cloudera closed pull request #1970: HIVE-24769: Included owner information in the Table object in Hi…

Posted by GitBox <gi...@apache.org>.
saihemanth-cloudera closed pull request #1970:
URL: https://github.com/apache/hive/pull/1970


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] nrg4878 commented on a change in pull request #1970: HIVE-24769: Included owner information in the Table object in Hi…

Posted by GitBox <gi...@apache.org>.
nrg4878 commented on a change in pull request #1970:
URL: https://github.com/apache/hive/pull/1970#discussion_r609665616



##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##########
@@ -2435,8 +2435,24 @@ public Type getType(String name) throws NoSuchObjectException, MetaException, TE
   @Override
   public List<String> getTables(String catName, String dbName, String tablePattern)
       throws TException {
-    List<String> tables = client.get_tables(prependCatalogToDbName(catName, dbName, conf), tablePattern);
-    return FilterUtils.filterTableNamesIfEnabled(isClientFilterEnabled, filterHook, catName, dbName, tables);
+    List<String> tables = new ArrayList<>();
+    GetProjectionsSpec projectionsSpec = new GetProjectionsSpec();
+    List<String> projectedFields = Arrays.asList("dbName", "tableName", "owner", "ownerType");
+    projectionsSpec.setFieldList(projectedFields);
+    GetTablesRequest req = new GetTablesRequest(dbName);
+    req.setCatName(catName);
+    req.setCapabilities(version);
+    req.setTblNames(tables);
+    req.setTablesPattern(tablePattern);
+    if (processorCapabilities != null)

Review comment:
       Thanks for adding the processorIdentifiier.

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##########
@@ -1853,15 +1853,25 @@ private MTable getMTable(String catName, String db, String table) {
       db = normalizeIdentifier(db);
       catName = normalizeIdentifier(catName);
 
-      List<String> lowered_tbl_names = new ArrayList<>(tbl_names.size());
-      for (String t : tbl_names) {
-        lowered_tbl_names.add(normalizeIdentifier(t));
+      List<String> lowered_tbl_names = new ArrayList<>();
+      if(tbl_names != null) {
+        lowered_tbl_names = new ArrayList<>(tbl_names.size());
+        for (String t : tbl_names) {
+          lowered_tbl_names.add(normalizeIdentifier(t));
+        }
       }
 
-      query = pm.newQuery(MTable.class);
-      query.setFilter("database.name == db && database.catalogName == cat && tbl_names.contains(tableName)");
-      query.declareParameters("java.lang.String db, java.lang.String cat, java.util.Collection tbl_names");
-
+      StringBuilder filterBuilder = new StringBuilder();
+      List<String> parameterVals = new ArrayList<>();
+      appendSimpleCondition(filterBuilder, "database.name", new String[] {db}, parameterVals);
+      appendSimpleCondition(filterBuilder, "database.catalogName", new String[] {catName}, parameterVals);
+      if(tbl_names != null){

Review comment:
       We should add a comment to explain the strategy on how this API uses the input arguments.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] nrg4878 commented on a change in pull request #1970: HIVE-24769: Included owner information in the Table object in Hi…

Posted by GitBox <gi...@apache.org>.
nrg4878 commented on a change in pull request #1970:
URL: https://github.com/apache/hive/pull/1970#discussion_r597840626



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##########
@@ -1854,13 +1854,21 @@ private MTable getMTable(String catName, String db, String table) {
       catName = normalizeIdentifier(catName);
 
       List<String> lowered_tbl_names = new ArrayList<>(tbl_names.size());
-      for (String t : tbl_names) {
-        lowered_tbl_names.add(normalizeIdentifier(t));
+      if(tbl_names != null) {
+        for (String t : tbl_names) {
+          lowered_tbl_names.add(normalizeIdentifier(t));
+        }
       }
 
       query = pm.newQuery(MTable.class);
-      query.setFilter("database.name == db && database.catalogName == cat && tbl_names.contains(tableName)");
-      query.declareParameters("java.lang.String db, java.lang.String cat, java.util.Collection tbl_names");
+
+      if(tablePattern == null) {

Review comment:
       So this is a bit confusing as we discussed. A cleaner and less confusing approach will be to honor both the tbl_names and tablePattern. So you will have 3 scenarios
   1) tbl_names not empty but tablePattern is null: Old behavior that will fetch just the objects for the tables provided.
   2) tbl_names is empty but tablePattern is not null: The behavior you are attempting to introduce where HMS will fetch all table object for names matching the tablePattern.
   3) tbl_names not empty and tablePattern not null: There is no usage scenario in the hive code base currently but this will return table objects for only those tables in the tbl_names whose names match the tablePattern. Others names in the list will be omitted.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] nrg4878 commented on pull request #1970: HIVE-24769: Included owner information in the Table object in Hi…

Posted by GitBox <gi...@apache.org>.
nrg4878 commented on pull request #1970:
URL: https://github.com/apache/hive/pull/1970#issuecomment-818104274


   This is been committed to master. Please close the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] nrg4878 commented on a change in pull request #1970: HIVE-24769: Included owner information in the Table object in Hi…

Posted by GitBox <gi...@apache.org>.
nrg4878 commented on a change in pull request #1970:
URL: https://github.com/apache/hive/pull/1970#discussion_r601701019



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##########
@@ -3585,31 +3585,35 @@ public GetTablesResult get_table_objects_by_name_req(GetTablesRequest req) throw
       if (dbName == null || dbName.isEmpty()) {
         throw new UnknownDBException("DB name is null or empty");
       }
-      if (tableNames == null) {
-        throw new InvalidOperationException(dbName + " cannot find null tables");
-      }
-
-      // The list of table names could contain duplicates. RawStore.getTableObjectsByName()
-      // only guarantees returning no duplicate table objects in one batch. If we need
-      // to break into multiple batches, remove duplicates first.
-      List<String> distinctTableNames = tableNames;
-      if (distinctTableNames.size() > tableBatchSize) {
-        List<String> lowercaseTableNames = new ArrayList<>();
-        for (String tableName : tableNames) {
-          lowercaseTableNames.add(org.apache.hadoop.hive.metastore.utils.StringUtils.normalizeIdentifier(tableName));
+      RawStore ms = getMS();
+      if(tablePattern != null){

Review comment:
       This if condition seems unnecessary now given that we support both tableNames and tablePattern if provided. Just the code in the else block seems sufficient. No?

##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##########
@@ -2435,8 +2435,24 @@ public Type getType(String name) throws NoSuchObjectException, MetaException, TE
   @Override
   public List<String> getTables(String catName, String dbName, String tablePattern)
       throws TException {
-    List<String> tables = client.get_tables(prependCatalogToDbName(catName, dbName, conf), tablePattern);
-    return FilterUtils.filterTableNamesIfEnabled(isClientFilterEnabled, filterHook, catName, dbName, tables);
+    List<String> tables = new ArrayList<>();
+    GetProjectionsSpec projectionsSpec = new GetProjectionsSpec();
+    List<String> projectedFields = Arrays.asList("dbName", "tableName", "owner", "ownerType");
+    projectionsSpec.setFieldList(projectedFields);
+    GetTablesRequest req = new GetTablesRequest(dbName);
+    req.setCatName(catName);
+    req.setCapabilities(version);
+    req.setTblNames(tables);
+    req.setTablesPattern(tablePattern);
+    if (processorCapabilities != null)

Review comment:
       processorIdentifier also needs to be passed in if set.

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##########
@@ -1896,7 +1917,12 @@ private MTable getMTable(String catName, String db, String table) {
           }
         } else if (projectionFields.size() == 1) {
           // Execute the query to fetch the partial results.
-          List<Object> results = (List<Object>) query.execute(db, catName, lowered_tbl_names);
+          List<Object[]> results = new ArrayList<>();
+          if(tablePattern == null) {

Review comment:
       Shouldn't there be 3 possiblities here?
   tablePattern != null && !lowered_tbl_names.isEmpty() --> execute() needs 4 arguments
   lowered_tbl_names.isEmpty() && tablePattern != null --> execute() needs 3 arguments
   !lowered_tbl_names.isEmpty() && tablePattern == null --> execute() needs 3 arguments

##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##########
@@ -2435,8 +2435,24 @@ public Type getType(String name) throws NoSuchObjectException, MetaException, TE
   @Override
   public List<String> getTables(String catName, String dbName, String tablePattern)
       throws TException {
-    List<String> tables = client.get_tables(prependCatalogToDbName(catName, dbName, conf), tablePattern);
-    return FilterUtils.filterTableNamesIfEnabled(isClientFilterEnabled, filterHook, catName, dbName, tables);
+    List<String> tables = new ArrayList<>();
+    GetProjectionsSpec projectionsSpec = new GetProjectionsSpec();
+    List<String> projectedFields = Arrays.asList("dbName", "tableName", "owner", "ownerType");

Review comment:
       We should use GetTableProjectionSpecBuilder instead of hard coding the fieldnames. builder.includeDatabase().includeTableName().includeOwner().includeOwnerType()

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##########
@@ -1877,11 +1889,20 @@ private MTable getMTable(String catName, String db, String table) {
       }
 
       if (projectionFields == null) {
-        mtables = (List<MTable>) query.execute(db, catName, lowered_tbl_names);
+        if(tablePattern == null) {
+          mtables = (List<MTable>) query.execute(db, catName, lowered_tbl_names);
+        }else{
+          mtables = (List<MTable>) query.execute(db, catName, tablePattern);

Review comment:
       Shouldn't there be 3 possiblities here?
   tablePattern != null && !lowered_tbl_names.isEmpty() --> execute() needs 4 arguments
   lowered_tbl_names.isEmpty() && tablePattern != null --> execute() needs 3 arguments
   !lowered_tbl_names.isEmpty() && tablePattern == null --> execute() needs 3 arguments

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##########
@@ -1853,14 +1853,26 @@ private MTable getMTable(String catName, String db, String table) {
       db = normalizeIdentifier(db);
       catName = normalizeIdentifier(catName);
 
-      List<String> lowered_tbl_names = new ArrayList<>(tbl_names.size());
-      for (String t : tbl_names) {
-        lowered_tbl_names.add(normalizeIdentifier(t));
+      List<String> lowered_tbl_names = new ArrayList<>();
+      if(tbl_names != null) {
+        lowered_tbl_names = new ArrayList<>(tbl_names.size());
+        for (String t : tbl_names) {
+          lowered_tbl_names.add(normalizeIdentifier(t));
+        }
       }
 
       query = pm.newQuery(MTable.class);
-      query.setFilter("database.name == db && database.catalogName == cat && tbl_names.contains(tableName)");
-      query.declareParameters("java.lang.String db, java.lang.String cat, java.util.Collection tbl_names");
+
+      if(tbl_names != null && !tbl_names.isEmpty() && tablePattern != null) {

Review comment:
       Can we simplify these conditions to something like this? 
   
   filterString = "database.name == db && database.catalogName == cat";
   filterParams = "java.lang.String db, java.lang.String cat";
   if(tbl_names != null && !tbl_names.isEmpty()) {
           filterString = filterString + " " + "&& tbl_names.contains(tableName)";
           filterParams = filterParams + " " + "java.util.Collection tbl_names";
         }
   if (tablePattern != null) {
           filterString = filterString + " " + "&& tableName.matches(tablePattern)");
           filterParams = filterParams + " " + "java.lang.String tablePattern";
         }
         

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##########
@@ -1877,11 +1889,20 @@ private MTable getMTable(String catName, String db, String table) {
       }
 
       if (projectionFields == null) {
-        mtables = (List<MTable>) query.execute(db, catName, lowered_tbl_names);
+        if(tablePattern == null) {
+          mtables = (List<MTable>) query.execute(db, catName, lowered_tbl_names);
+        }else{
+          mtables = (List<MTable>) query.execute(db, catName, tablePattern);
+        }
       } else {
         if (projectionFields.size() > 1) {
           // Execute the query to fetch the partial results.
-          List<Object[]> results = (List<Object[]>) query.execute(db, catName, lowered_tbl_names);
+          List<Object[]> results = new ArrayList<>();
+          if(tablePattern == null) {

Review comment:
       Shouldn't there be 3 possiblities here?
   tablePattern != null && !lowered_tbl_names.isEmpty() --> execute() needs 4 arguments
   lowered_tbl_names.isEmpty() && tablePattern != null --> execute() needs 3 arguments
   !lowered_tbl_names.isEmpty() && tablePattern == null --> execute() needs 3 arguments




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] saihemanth-cloudera commented on a change in pull request #1970: HIVE-24769: Included owner information in the Table object in Hi…

Posted by GitBox <gi...@apache.org>.
saihemanth-cloudera commented on a change in pull request #1970:
URL: https://github.com/apache/hive/pull/1970#discussion_r605874641



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##########
@@ -3585,31 +3585,35 @@ public GetTablesResult get_table_objects_by_name_req(GetTablesRequest req) throw
       if (dbName == null || dbName.isEmpty()) {
         throw new UnknownDBException("DB name is null or empty");
       }
-      if (tableNames == null) {
-        throw new InvalidOperationException(dbName + " cannot find null tables");
-      }
-
-      // The list of table names could contain duplicates. RawStore.getTableObjectsByName()
-      // only guarantees returning no duplicate table objects in one batch. If we need
-      // to break into multiple batches, remove duplicates first.
-      List<String> distinctTableNames = tableNames;
-      if (distinctTableNames.size() > tableBatchSize) {
-        List<String> lowercaseTableNames = new ArrayList<>();
-        for (String tableName : tableNames) {
-          lowercaseTableNames.add(org.apache.hadoop.hive.metastore.utils.StringUtils.normalizeIdentifier(tableName));
+      RawStore ms = getMS();
+      if(tablePattern != null){

Review comment:
       Then we'll not be addressing the scenario of table list can be empty and table pattern is set to some regex. We'll we returning an empty list in this case

##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##########
@@ -2435,8 +2435,24 @@ public Type getType(String name) throws NoSuchObjectException, MetaException, TE
   @Override
   public List<String> getTables(String catName, String dbName, String tablePattern)
       throws TException {
-    List<String> tables = client.get_tables(prependCatalogToDbName(catName, dbName, conf), tablePattern);
-    return FilterUtils.filterTableNamesIfEnabled(isClientFilterEnabled, filterHook, catName, dbName, tables);
+    List<String> tables = new ArrayList<>();
+    GetProjectionsSpec projectionsSpec = new GetProjectionsSpec();
+    List<String> projectedFields = Arrays.asList("dbName", "tableName", "owner", "ownerType");

Review comment:
       GetTableProjectionsSpecBuilder is not importable in HiveMetaStoreClient class so we'll have to do this in this way only I guess.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org