You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/03/17 13:12:01 UTC

[GitHub] [iceberg] rymurr commented on a change in pull request #2345: Nessie: assert that `NessieCatalog` is initialized

rymurr commented on a change in pull request #2345:
URL: https://github.com/apache/iceberg/pull/2345#discussion_r595972212



##########
File path: nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java
##########
@@ -78,31 +78,63 @@
   private UpdateableReference reference;
   private String name;
   private FileIO fileIO;
+  private boolean initialized;
 
   public NessieCatalog() {
   }
 
   @Override
   public void initialize(String inputName, Map<String, String> options) {
-    String fileIOImpl = options.get(CatalogProperties.FILE_IO_IMPL);
-    this.fileIO = fileIOImpl == null ? new HadoopFileIO(config) : CatalogUtil.loadFileIO(fileIOImpl, options, config);
-    this.name = inputName == null ? "nessie" : inputName;
-    // remove nessie prefix
-    final Function<String, String> removePrefix = x -> x.replace("nessie.", "");
+    if (initialized) {
+      close();
+    }
+    if (config == null) {
+      throw new IllegalStateException(String.format("setConf() must be called before initialize() for NessieCatalog '%s'", name));

Review comment:
       Iceberg logs are of the format `Cannot <do something>. <reason>. [possible fix]` 

##########
File path: nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java
##########
@@ -78,31 +78,63 @@
   private UpdateableReference reference;
   private String name;
   private FileIO fileIO;
+  private boolean initialized;
 
   public NessieCatalog() {
   }
 
   @Override
   public void initialize(String inputName, Map<String, String> options) {
-    String fileIOImpl = options.get(CatalogProperties.FILE_IO_IMPL);
-    this.fileIO = fileIOImpl == null ? new HadoopFileIO(config) : CatalogUtil.loadFileIO(fileIOImpl, options, config);
-    this.name = inputName == null ? "nessie" : inputName;
-    // remove nessie prefix
-    final Function<String, String> removePrefix = x -> x.replace("nessie.", "");
+    if (initialized) {

Review comment:
       I am not sure we should support 2x initialization

##########
File path: nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java
##########
@@ -78,31 +78,63 @@
   private UpdateableReference reference;
   private String name;
   private FileIO fileIO;
+  private boolean initialized;
 
   public NessieCatalog() {
   }
 
   @Override
   public void initialize(String inputName, Map<String, String> options) {
-    String fileIOImpl = options.get(CatalogProperties.FILE_IO_IMPL);
-    this.fileIO = fileIOImpl == null ? new HadoopFileIO(config) : CatalogUtil.loadFileIO(fileIOImpl, options, config);
-    this.name = inputName == null ? "nessie" : inputName;
-    // remove nessie prefix
-    final Function<String, String> removePrefix = x -> x.replace("nessie.", "");
+    if (initialized) {
+      close();
+    }
+    if (config == null) {
+      throw new IllegalStateException(String.format("setConf() must be called before initialize() for NessieCatalog '%s'", name));
+    }
+
+    try {
+      String fileIOImpl = options.get(CatalogProperties.FILE_IO_IMPL);
+      this.fileIO = fileIOImpl == null ? new HadoopFileIO(config) : CatalogUtil.loadFileIO(fileIOImpl, options, config);
+      this.name = inputName == null ? "nessie" : inputName;
+      // remove nessie prefix
+      final Function<String, String> removePrefix = x -> x.replace("nessie.", "");
 
-    this.client = NessieClient.builder().fromConfig(x -> options.get(removePrefix.apply(x))).build();
+      this.client = NessieClient.builder().fromConfig(x -> options.get(removePrefix.apply(x))).build();
 
-    this.warehouseLocation = options.get(CatalogProperties.WAREHOUSE_LOCATION);
-    if (warehouseLocation == null) {
-      throw new IllegalStateException("Parameter warehouse not set, nessie can't store data.");
+      this.warehouseLocation = options.get(CatalogProperties.WAREHOUSE_LOCATION);
+      if (warehouseLocation == null) {
+        throw new IllegalStateException("Parameter warehouse not set, nessie can't store data.");
+      }
+      final String requestedRef = options.get(removePrefix.apply(NessieConfigConstants.CONF_NESSIE_REF));
+      this.reference = loadReference(requestedRef);
+
+      this.initialized = true;
+    } catch (RuntimeException e) {
+      close();
+      throw new RuntimeException("Failed to initialize NessieCatalog with options " + options);
+    }
+  }
+
+  private void assertInitialized() {

Review comment:
       All catalogs have to perform the same steps: {no-arg constructor, setConf, initialize} in that order. I wonder if there is a way to extend this check beyond the nessie catalog. Some ideas:
   
   * enforce in base class
   * don't support initialization outside of `CatalogUtil`
   
   It seems like adding this logic into every `Catalog` and expecting users to know how they have to be constructed in order is fragile
   
   @rdblue @aokolnychyi @RussellSpitzer what do you think?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org