Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/21 14:59:58 UTC

[GitHub] [iceberg] rymurr commented on a change in pull request #2129: Hive: Configure catalog type on table level.

rymurr commented on a change in pull request #2129:
URL: https://github.com/apache/iceberg/pull/2129#discussion_r561945734



##########
File path: mr/src/main/java/org/apache/iceberg/mr/Catalogs.java
##########
@@ -180,47 +186,80 @@ public static boolean dropTable(Configuration conf, Properties props) {
   /**
    * Returns true if HiveCatalog is used
    * @param conf a Hadoop conf
+   * @param props the controlling properties
    * @return true if the Catalog is HiveCatalog
    */
-  public static boolean hiveCatalog(Configuration conf) {
-    return HIVE.equalsIgnoreCase(conf.get(InputFormatConfig.CATALOG));
+  public static boolean hiveCatalog(Configuration conf, Properties props) {
+    String catalogName = props.getProperty(InputFormatConfig.TABLE_CATALOG);
+    if (catalogName != null) {
+      return HIVE.equals(conf.get(String.format(InputFormatConfig.CATALOG_TYPE_TEMPLATE, catalogName)));
+    } else {
+      if (HIVE.equals(conf.get(String.format(InputFormatConfig.CATALOG_TYPE_TEMPLATE, "default")))) {
+        return true;
+      } else {
+        return HIVE.equalsIgnoreCase(conf.get(InputFormatConfig.CATALOG));
+      }
+    }
   }
 
   @VisibleForTesting
-  static Optional<Catalog> loadCatalog(Configuration conf) {
-    String catalogLoaderClass = conf.get(InputFormatConfig.CATALOG_LOADER_CLASS);
-
-    if (catalogLoaderClass != null) {
-      CatalogLoader loader = (CatalogLoader) DynConstructors.builder(CatalogLoader.class)
-              .impl(catalogLoaderClass)
-              .build()
-              .newInstance();
-      Catalog catalog = loader.load(conf);
-      LOG.info("Loaded catalog {} using {}", catalog, catalogLoaderClass);
-      return Optional.of(catalog);
+  static Optional<Catalog> loadCatalog(Configuration conf, String catalogName) {
+    String catalogType;
+    String name = catalogName;
+    if (name == null) {
+      name = "default";
     }
+    catalogType = conf.get(String.format(InputFormatConfig.CATALOG_TYPE_TEMPLATE, name));
 
-    String catalogName = conf.get(InputFormatConfig.CATALOG);
+    // keep both catalog configuration methods for seamless transition
+    if (catalogType != null) {
+      // new logic
+      return loadCatalog(conf, catalogType, String.format(InputFormatConfig.CATALOG_WAREHOUSE_TEMPLATE, name),
+              String.format(InputFormatConfig.CATALOG_LOADER_CLASS_TEMPLATE, name));
+    } else {
+      // old logic
+      // use catalog {@link InputFormatConfig.CATALOG} stored in global hive config if table specific catalog
+      // configuration or default catalog definition is missing
+      catalogType = conf.get(InputFormatConfig.CATALOG);
+      return loadCatalog(conf, catalogType, InputFormatConfig.HADOOP_CATALOG_WAREHOUSE_LOCATION,
+              InputFormatConfig.CATALOG_LOADER_CLASS);
+    }
+  }
 
-    if (catalogName != null) {
-      Catalog catalog;
-      switch (catalogName.toLowerCase()) {
-        case HADOOP:
-          String warehouseLocation = conf.get(InputFormatConfig.HADOOP_CATALOG_WAREHOUSE_LOCATION);
+  private static Optional<Catalog> loadCatalog(Configuration conf, String catalogType,
+                                               String warehouseLocationConfigName, String loaderClassConfigName) {
+    if (catalogType == null) {
+      LOG.info("Catalog is not configured");
+      return Optional.empty();
+    }
 
-          catalog = (warehouseLocation != null) ? new HadoopCatalog(conf, warehouseLocation) : new HadoopCatalog(conf);
-          LOG.info("Loaded Hadoop catalog {}", catalog);
+    Catalog catalog;
+    switch (catalogType.toLowerCase()) {
+      case HADOOP:
+        String warehouseLocation = conf.get(warehouseLocationConfigName);
+        catalog = (warehouseLocation != null) ? new HadoopCatalog(conf, warehouseLocation) :
+                new HadoopCatalog(conf);
+        LOG.info("Loaded Hadoop catalog {}", catalog);
+        return Optional.of(catalog);
+      case HIVE:
+        catalog = HiveCatalogs.loadCatalog(conf);
+        LOG.info("Loaded Hive Metastore catalog {}", catalog);
+        return Optional.of(catalog);
+      case CUSTOM:
+        String catalogLoaderClass = conf.get(loaderClassConfigName);
+        if (catalogLoaderClass != null) {
+          CatalogLoader loader = (CatalogLoader) DynConstructors.builder(CatalogLoader.class)

Review comment:
       I wonder if the `CatalogLoader` class is needed at all. `CatalogUtil` already has a method to get a catalog from a class name and configuration parameters, which is used in Spark and Flink.

   The `CatalogLoader` interface also creates potentially weird dependencies between the catalog modules and the mr module. When implementing this interface for the Nessie catalog, I had to depend on the mr module just for that interface.

   I would be in favour of deprecating `CatalogLoader` in favour of `CatalogUtil`.
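   For context, the `CatalogUtil` approach being suggested boils down to instantiating a catalog implementation reflectively by class name and then initializing it with a name and a property map, rather than going through a dedicated loader interface. A minimal, self-contained sketch of that pattern follows; note that `Catalog`, `InMemoryCatalog`, and `loadCatalog` here are hypothetical stand-ins for illustration, not Iceberg's actual classes or signatures:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Iceberg's Catalog interface (illustration only).
interface Catalog {
  String name();
}

// Hypothetical catalog implementation that would be referenced by class name in config.
class InMemoryCatalog implements Catalog {
  private String name;

  // Implementations expose an initialize hook taking a name and configuration properties.
  public void initialize(String name, Map<String, String> properties) {
    this.name = name;
  }

  public String name() {
    return name;
  }
}

public class CatalogUtilSketch {

  // Mirrors the shape of a CatalogUtil-style loader: resolve the class by name,
  // construct it with the no-arg constructor, then call initialize(name, properties).
  static Catalog loadCatalog(String impl, String name, Map<String, String> properties)
      throws Exception {
    Class<?> clazz = Class.forName(impl);
    Object instance = clazz.getDeclaredConstructor().newInstance();
    clazz.getMethod("initialize", String.class, Map.class).invoke(instance, name, properties);
    return (Catalog) instance;
  }

  public static void main(String[] args) throws Exception {
    // The impl class name would come from configuration instead of a loader interface.
    Catalog catalog = loadCatalog("InMemoryCatalog", "default", new HashMap<>());
    System.out.println(catalog.name()); // prints "default"
  }
}
```

   With this pattern, the mr module only needs the fully qualified class name from configuration, so catalog modules would not need to implement an mr-owned interface.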




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org