Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/03/25 19:00:08 UTC

[jira] [Work logged] (HIVE-24705) Create/Alter/Drop tables based on storage handlers in HS2 should be authorized by Ranger/Sentry

     [ https://issues.apache.org/jira/browse/HIVE-24705?focusedWorklogId=572212&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-572212 ]

ASF GitHub Bot logged work on HIVE-24705:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Mar/21 19:00
            Start Date: 25/Mar/21 19:00
    Worklog Time Spent: 10m 
      Work Description: nrg4878 commented on a change in pull request #1960:
URL: https://github.com/apache/hive/pull/1960#discussion_r601728901



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageAuthorizationHandler.java
##########
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.metadata;
+
+import org.apache.hadoop.hive.common.classification.InterfaceAudience;
+import org.apache.hadoop.hive.common.classification.InterfaceStability;
+
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.Map;
+
+/**
+ * HiveStorageAuthorizationHandler defines a pluggable interface for
+ * authorization of storage based tables in Hive. A Storage authorization
+ * handler consists of a bundle of the following:
+ *
+ *<ul>
+ *<li>getURI
+ *</ul>
+ *
+ * Storage authorization handler classes are plugged in using the STORED BY 'classname'
+ * clause in CREATE TABLE.
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Stable
+public interface HiveStorageAuthorizationHandler{

Review comment:
       nit: the name "HiveStorageAuthorizationHandler" does not seem to fit the purpose of the interface. I am just throwing out names here, but would something like HiveURIBasedAuthorization or HiveUriBasedStorageHandler (or something akin to these) make more sense?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageAuthorizationHandler.java
##########
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.metadata;
+
+import org.apache.hadoop.hive.common.classification.InterfaceAudience;
+import org.apache.hadoop.hive.common.classification.InterfaceStability;
+
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.Map;
+
+/**
+ * HiveStorageAuthorizationHandler defines a pluggable interface for
+ * authorization of storage based tables in Hive. A Storage authorization
+ * handler consists of a bundle of the following:
+ *
+ *<ul>
+ *<li>getURI
+ *</ul>
+ *
+ * Storage authorization handler classes are plugged in using the STORED BY 'classname'
+ * clause in CREATE TABLE.
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Stable
+public interface HiveStorageAuthorizationHandler{
+
+    /**
+     * @return get URI for authentication implementation,
+     * should return uri with table properties.
+     */
+    public URI getURIForAuth(Map<String, String> tableProperties) throws URISyntaxException;
+}

Review comment:
       Nit: Add a newline character at the end of the file.

##########
File path: kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java
##########
@@ -65,13 +68,16 @@
 /**
  * Hive Kafka storage handler to allow user to read and write from/to Kafka message bus.
  */
-@SuppressWarnings("ALL") public class KafkaStorageHandler extends DefaultHiveMetaHook implements HiveStorageHandler {
+@SuppressWarnings("ALL") public class KafkaStorageHandler extends DefaultHiveMetaHook implements HiveStorageHandler, HiveStorageAuthorizationHandler {

Review comment:
       I don't think the StorageHandlers are serializable classes. Can you confirm? If they are, then we should stamp these classes with a serialVersionUID to avoid execution engines using the wrong jars.
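For what it's worth, here is a minimal illustration of the serialVersionUID point. ExampleSerializableHandler is a made-up class for demonstration, not from the PR:

```java
import java.io.Serializable;

// Illustrative only: if the storage handlers do turn out to be Serializable,
// pinning serialVersionUID guards against InvalidClassException when an
// execution engine deserializes the class from a different jar version.
public class ExampleSerializableHandler implements Serializable {
    private static final long serialVersionUID = 1L;

    // Handler configuration/state would live here in a real handler.
}
```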

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/events/AlterTableEvent.java
##########
@@ -101,6 +112,36 @@ private HiveOperationType getOperationType() {
       ret.add(getHivePrivilegeObjectDfsUri(newUri));
     }
 
+    if(newTable.getParameters().containsKey(hive_metastoreConstants.META_TABLE_STORAGE)) {
+      String storageUri = "";
+      DefaultStorageHandler defaultStorageHandler = null;
+      HiveStorageHandler hiveStorageHandler = null;
+      Configuration conf = new Configuration();
+      Map<String, String> tableProperties = new HashMap<>();
+      tableProperties.putAll(newTable.getSd().getSerdeInfo().getParameters());

Review comment:
       I am not sure I understand this logic. Why are the SerDe.getParameters() being added to Table.getParameters()? Is this what we always do?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/events/AlterTableEvent.java
##########
@@ -101,6 +112,36 @@ private HiveOperationType getOperationType() {
       ret.add(getHivePrivilegeObjectDfsUri(newUri));
     }
 
+    if(newTable.getParameters().containsKey(hive_metastoreConstants.META_TABLE_STORAGE)) {
+      String storageUri = "";
+      DefaultStorageHandler defaultStorageHandler = null;
+      HiveStorageHandler hiveStorageHandler = null;
+      Configuration conf = new Configuration();
+      Map<String, String> tableProperties = new HashMap<>();
+      tableProperties.putAll(newTable.getSd().getSerdeInfo().getParameters());
+      tableProperties.putAll(newTable.getParameters());
+      try {
+        hiveStorageHandler = (HiveStorageHandler) ReflectionUtils.newInstance(
+                conf.getClassByName(newTable.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE)), event.getHandler().getConf());
+        Method methodIsImplemented = hiveStorageHandler.getClass().getMethod("getURIForAuth", Map.class);
+        if(methodIsImplemented != null && hiveStorageHandler instanceof DefaultStorageHandler) {

Review comment:
       Isn't it enough to just check whether the storage handler class implements the new interface? Something like:
   (hiveStorageHandler instanceof HiveStorageAuthorizationHandler)
   
   That would be true for all the Hive-implemented handlers, and custom handlers that don't implement it wouldn't use URI authorization, right?
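If it helps, a self-contained sketch of the suggested instanceof dispatch (stub types stand in for the real Hive interfaces; the handler classes and property names are illustrative, not from the PR):

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Map;

public class AuthUriDispatchSketch {

    // Stand-ins for the real org.apache.hadoop.hive.ql.metadata interfaces.
    interface HiveStorageHandler {}

    interface HiveStorageAuthorizationHandler {
        URI getURIForAuth(Map<String, String> tableProperties) throws URISyntaxException;
    }

    // A handler that opts into URI-based authorization (as the built-in ones would).
    static class KafkaLikeHandler implements HiveStorageHandler, HiveStorageAuthorizationHandler {
        public URI getURIForAuth(Map<String, String> props) throws URISyntaxException {
            return new URI("kafka://" + props.getOrDefault("kafka.topic", "unknown"));
        }
    }

    // A legacy custom handler that does not implement the new interface.
    static class LegacyHandler implements HiveStorageHandler {}

    static String storageUriFor(HiveStorageHandler handler, Map<String, String> props) {
        if (handler instanceof HiveStorageAuthorizationHandler) {
            try {
                // Handlers that opted in supply their own authorization URI.
                return ((HiveStorageAuthorizationHandler) handler).getURIForAuth(props).toString();
            } catch (URISyntaxException e) {
                // fall through to the generic fallback below
            }
        }
        // Fallback for custom handlers: synthesize an identifier from the class name.
        return handler.getClass().getName() + "://" + props.toString();
    }
}
```

A single instanceof check replaces both the reflective getMethod lookup and the DefaultStorageHandler special case.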

##########
File path: kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java
##########
@@ -65,13 +68,16 @@
 /**
  * Hive Kafka storage handler to allow user to read and write from/to Kafka message bus.
  */
-@SuppressWarnings("ALL") public class KafkaStorageHandler extends DefaultHiveMetaHook implements HiveStorageHandler {
+@SuppressWarnings("ALL") public class KafkaStorageHandler extends DefaultHiveMetaHook implements HiveStorageHandler, HiveStorageAuthorizationHandler {
 
   private static final Logger LOG = LoggerFactory.getLogger(KafkaStorageHandler.class);
   private static final String KAFKA_STORAGE_HANDLER = "org.apache.hadoop.hive.kafka.KafkaStorageHandler";
 
   private Configuration configuration;
 
+  /** Kafka prefix to form the URI for authentication */
+  private static final String KAFKA_PREFIX = "kafka:";

Review comment:
       Nit: Should we call this constant the same thing across all handlers instead of making it specific to each handler? Something like URI_PREFIX instead of KAFKA_PREFIX, HBASE_PREFIX, etc.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/events/AlterTableEvent.java
##########
@@ -101,6 +112,36 @@ private HiveOperationType getOperationType() {
       ret.add(getHivePrivilegeObjectDfsUri(newUri));
     }
 
+    if(newTable.getParameters().containsKey(hive_metastoreConstants.META_TABLE_STORAGE)) {
+      String storageUri = "";
+      DefaultStorageHandler defaultStorageHandler = null;
+      HiveStorageHandler hiveStorageHandler = null;
+      Configuration conf = new Configuration();
+      Map<String, String> tableProperties = new HashMap<>();
+      tableProperties.putAll(newTable.getSd().getSerdeInfo().getParameters());
+      tableProperties.putAll(newTable.getParameters());
+      try {
+        hiveStorageHandler = (HiveStorageHandler) ReflectionUtils.newInstance(
+                conf.getClassByName(newTable.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE)), event.getHandler().getConf());
+        Method methodIsImplemented = hiveStorageHandler.getClass().getMethod("getURIForAuth", Map.class);
+        if(methodIsImplemented != null && hiveStorageHandler instanceof DefaultStorageHandler) {
+          DefaultStorageHandler defaultHandler = (DefaultStorageHandler) ReflectionUtils.newInstance(
+                  conf.getClassByName(newTable.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE)), event.getHandler().getConf());
+          storageUri = defaultHandler.getURIForAuth(tableProperties).toString();
+        }else if(methodIsImplemented != null && hiveStorageHandler instanceof HiveStorageAuthorizationHandler){
+          HiveStorageAuthorizationHandler authorizationHandler = (HiveStorageAuthorizationHandler) ReflectionUtils.newInstance(
+                  conf.getClassByName(newTable.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE)), event.getHandler().getConf());
+          storageUri = authorizationHandler.getURIForAuth(tableProperties).toString();
+        }
+      }catch(Exception ex){
+        //Custom storage handler that has not implemented the getURIForAuth()
+        storageUri = hiveStorageHandler.getClass().getName()+"://"+

Review comment:
       If the above comment makes sense, then this should go into the else block of the condition instead of the exception handler.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/command/CommandAuthorizerV2.java
##########
@@ -185,6 +189,38 @@ private static void addHivePrivObject(Entity privObject, Map<String, List<String
           tableName2Cols.get(Table.getCompleteName(table.getDbName(), table.getTableName()));
       hivePrivObject = new HivePrivilegeObject(privObjType, table.getDbName(), table.getTableName(),
           null, columns, actionType, null, null, table.getOwner(), table.getOwnerType());
+      if(table.getStorageHandler() != null){
+        //TODO: add hive privilege object for storage based handlers for create and alter table commands.
+        if(hiveOpType == HiveOperationType.CREATETABLE ||
+                hiveOpType == HiveOperationType.ALTERTABLE_PROPERTIES ||
+                hiveOpType == HiveOperationType.CREATETABLE_AS_SELECT){
+          String storageuri = null;
+          Map<String, String> tableProperties = new HashMap<>();
+          Configuration conf = new Configuration();
+          tableProperties.putAll(table.getSd().getSerdeInfo().getParameters());
+          tableProperties.putAll(table.getParameters());
+          try {
+            Method methodIsImplemented = table.getStorageHandler().getClass().getMethod("getURIForAuth", Map.class);
+            if(methodIsImplemented != null && table.getStorageHandler() instanceof DefaultStorageHandler) {
+              DefaultStorageHandler defaultHandler = (DefaultStorageHandler) ReflectionUtils.newInstance(
+                      conf.getClassByName(table.getStorageHandler().getClass().getName()), SessionState.get().getConf());
+              storageuri = defaultHandler.getURIForAuth(tableProperties).toString();
+            }else if(methodIsImplemented != null && table.getStorageHandler() instanceof HiveStorageAuthorizationHandler){
+              HiveStorageAuthorizationHandler authorizationHandler = (HiveStorageAuthorizationHandler) ReflectionUtils.newInstance(
+                      conf.getClassByName(table.getStorageHandler().getClass().getName()), SessionState.get().getConf());
+              storageuri = authorizationHandler.getURIForAuth(tableProperties).toString();
+            }
+          }catch(NullPointerException nullExp){
+            throw nullExp;
+          }catch(Exception ex){
+            //Custom storage handler that has not implemented the getURIForAuth()
+            storageuri = table.getStorageHandler().getClass().getName()+"://"+

Review comment:
       If the above change makes sense, then this code should go into the else block instead of the exception handler.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/events/AlterTableEvent.java
##########
@@ -116,4 +157,13 @@ private String buildCommandString(String cmdStr, Table tbl) {
     }
     return ret;
   }
+
+  private static String getTablePropsForCustomStorageHandler(Map<String, String> tableProperties){

Review comment:
       Should we consolidate these two implementations of this method into some utility class?
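A sketch of what such a utility class could look like. The class name is hypothetical, and the "key=value#key=value" output format is an assumption (the method body is not shown in this hunk):

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical shared home for the helper currently duplicated in
// AlterTableEvent and CommandAuthorizerV2.
public final class StorageHandlerAuthUtils {

    private StorageHandlerAuthUtils() {
        // static utility, not instantiable
    }

    /** Serializes table properties into one deterministic string, sorted by key. */
    public static String tablePropsForCustomStorageHandler(Map<String, String> tableProperties) {
        StringBuilder sb = new StringBuilder();
        // TreeMap gives a stable key order, so the same properties always
        // produce the same URI suffix regardless of map iteration order.
        for (Map.Entry<String, String> e : new TreeMap<>(tableProperties).entrySet()) {
            if (sb.length() > 0) {
                sb.append('#');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}
```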

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##########
@@ -13707,14 +13709,25 @@ ASTNode analyzeCreateTable(
 
   /** Adds entities for create table/create view. */
   private void addDbAndTabToOutputs(String[] qualifiedTabName, TableType type,
-      boolean isTemporary, Map<String, String> tblProps) throws SemanticException {
+      boolean isTemporary, Map<String, String> tblProps, StorageFormat storageFormat) throws SemanticException {
     Database database  = getDatabase(qualifiedTabName[0]);
     outputs.add(new WriteEntity(database, WriteEntity.WriteType.DDL_SHARED));
 
     Table t = new Table(qualifiedTabName[0], qualifiedTabName[1]);
     t.setParameters(tblProps);
     t.setTableType(type);
     t.setTemporary(isTemporary);
+    HiveStorageHandler storageHandler = null;
+    try {
+      storageHandler = (HiveStorageHandler) ReflectionUtils.newInstance(
+              conf.getClassByName(storageFormat.getStorageHandler()), SessionState.get().getConf());
+    }catch(ClassNotFoundException ex){
+      System.out.println("Class not found. Storage handler will be set to null: "+ex);
+    }
+    t.setStorageHandler(storageHandler);
+    for(Map.Entry<String,String> serdeMap : storageFormat.getSerdeProps().entrySet()){

Review comment:
       Is there a reason we are setting the serde params on the table params? Can you give me an example of why we want to do this?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/command/CommandAuthorizerV2.java
##########
@@ -185,6 +189,38 @@ private static void addHivePrivObject(Entity privObject, Map<String, List<String
           tableName2Cols.get(Table.getCompleteName(table.getDbName(), table.getTableName()));
       hivePrivObject = new HivePrivilegeObject(privObjType, table.getDbName(), table.getTableName(),
           null, columns, actionType, null, null, table.getOwner(), table.getOwnerType());
+      if(table.getStorageHandler() != null){
+        //TODO: add hive privilege object for storage based handlers for create and alter table commands.
+        if(hiveOpType == HiveOperationType.CREATETABLE ||
+                hiveOpType == HiveOperationType.ALTERTABLE_PROPERTIES ||
+                hiveOpType == HiveOperationType.CREATETABLE_AS_SELECT){
+          String storageuri = null;
+          Map<String, String> tableProperties = new HashMap<>();
+          Configuration conf = new Configuration();
+          tableProperties.putAll(table.getSd().getSerdeInfo().getParameters());
+          tableProperties.putAll(table.getParameters());
+          try {
+            Method methodIsImplemented = table.getStorageHandler().getClass().getMethod("getURIForAuth", Map.class);
+            if(methodIsImplemented != null && table.getStorageHandler() instanceof DefaultStorageHandler) {

Review comment:
       So DefaultStorageHandler implements the "HiveStorageAuthorizationHandler" interface, right? What is the difference between these if/else conditions? Feels like the code should do something like this:
   if (table.getStorageHandler() instanceof HiveStorageAuthorizationHandler) {
     storageuri = ((HiveStorageAuthorizationHandler) table.getStorageHandler()).getURIForAuth(tableProperties).toString();
   }
   

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/command/CommandAuthorizerV2.java
##########
@@ -185,6 +189,38 @@ private static void addHivePrivObject(Entity privObject, Map<String, List<String
           tableName2Cols.get(Table.getCompleteName(table.getDbName(), table.getTableName()));
       hivePrivObject = new HivePrivilegeObject(privObjType, table.getDbName(), table.getTableName(),
           null, columns, actionType, null, null, table.getOwner(), table.getOwnerType());
+      if(table.getStorageHandler() != null){
+        //TODO: add hive privilege object for storage based handlers for create and alter table commands.
+        if(hiveOpType == HiveOperationType.CREATETABLE ||
+                hiveOpType == HiveOperationType.ALTERTABLE_PROPERTIES ||
+                hiveOpType == HiveOperationType.CREATETABLE_AS_SELECT){
+          String storageuri = null;
+          Map<String, String> tableProperties = new HashMap<>();
+          Configuration conf = new Configuration();
+          tableProperties.putAll(table.getSd().getSerdeInfo().getParameters());
+          tableProperties.putAll(table.getParameters());
+          try {
+            Method methodIsImplemented = table.getStorageHandler().getClass().getMethod("getURIForAuth", Map.class);

Review comment:
       Same here. Wouldn't it be enough to just check whether the class implements the new interface that is being added, instead of checking whether the method is implemented?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 572212)
    Time Spent: 40m  (was: 0.5h)

> Create/Alter/Drop tables based on storage handlers in HS2 should be authorized by Ranger/Sentry
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-24705
>                 URL: https://issues.apache.org/jira/browse/HIVE-24705
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sai Hemanth Gantasala
>            Assignee: Sai Hemanth Gantasala
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> With doAs=false in Hive 3.x, whenever a user tries to create a table based on a storage handler over external storage (for example, an HBase table), the end user that Apache Ranger/Sentry sees is "hive", so they cannot enforce the condition on the actual end user. We therefore need to enforce this condition in Hive in the event of create/alter/drop of tables based on storage handlers.
> Built-in Hive storage handlers like HBaseStorageHandler, KafkaStorageHandler, etc. should implement a method getURIForAuthentication() which returns a URI formed from table properties. This URI can then be sent to Ranger/Sentry for authorization.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)