You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/06 21:12:23 UTC

[GitHub] [iceberg] kbendick commented on a change in pull request #3275: Support for Namespace properties in JDBC Catalog

kbendick commented on a change in pull request #3275:
URL: https://github.com/apache/iceberg/pull/3275#discussion_r779861519



##########
File path: core/src/test/java/org/apache/iceberg/jdbc/TestJdbcCatalog.java
##########
@@ -555,6 +556,66 @@ public void testDropNamespace() {
         "is not empty. 1 tables exist.", () -> catalog.dropNamespace(tbl4.namespace()));
   }
 
+  @Test
+  public void testCreateNamespace() {
+    Namespace testNamespace = Namespace.of("testDb", "ns1", "ns2");
+    // Test with null metadata
+    AssertHelpers.assertThrows("Cannot create a namespace with null or empty metadata", IllegalArgumentException.class,
+            () -> catalog.createNamespace(testNamespace, null));
+    Assert.assertFalse(catalog.namespaceExists(testNamespace));
+
+    // Test with metadata
+    Map<String, String> testMetadata = ImmutableMap.of("key_1", "value_1", "key_2", "value_2", "key_3", "value_3");
+    catalog.createNamespace(testNamespace, testMetadata);
+    Assert.assertTrue(catalog.namespaceExists(testNamespace));
+  }
+
+  @Test
+  public void testSetProperties() {
+    Namespace testNamespace = Namespace.of("testDb", "ns1", "ns2");
+    Map<String, String> testMetadata = ImmutableMap.of("key_1", "value_1", "key_2", "value_2",
+            "key_3", "value_3");
+    catalog.createNamespace(testNamespace, testMetadata);
+
+    // Add more properties to set to test insert and update
+    Map<String, String> propertiesToSet = ImmutableMap.of("key_1", "new_value_1", "key_2", "new_value_2", "key_3",
+            "new_value_3", "key_4", "value_4", "key_5", "value_5");

Review comment:
       Can we add the properties to set in an order where the keys don't already align with the ordering of the existing metadata?
   
   I'm wondering if we need to add a check if we're at the right row number when calling the `updatePrperties` function. I think even just removing `key_2` from here would trigger this case.

##########
File path: core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java
##########
@@ -250,10 +265,95 @@ public void setConf(Configuration conf) {
     this.conf = conf;
   }
 
+  private boolean insertProperties(Namespace namespace, Map<String, String> properties) {
+    String namespaceName = JdbcUtil.namespaceToString(namespace);
+
+    try {
+      int insertedRecords = connections.run(conn -> {
+        String sqlStatement = JdbcUtil.insertPropertiesStatement(properties.size());
+
+        try (PreparedStatement sql = conn.prepareStatement(sqlStatement)) {
+          int rowIndex = 0;
+          for (Map.Entry<String, String> keyValue : properties.entrySet()) {
+            sql.setString(rowIndex + 1, catalogName);
+            sql.setString(rowIndex + 2, namespaceName);
+            sql.setString(rowIndex + 3, keyValue.getKey());
+            sql.setString(rowIndex + 4, keyValue.getValue());
+            rowIndex += 4;
+          }
+          return sql.executeUpdate();
+        }
+      });
+
+      if (insertedRecords == properties.size()) {
+        LOG.debug("Successfully inserted {} properties for namespace {}", properties, namespaceName);
+        return true;
+      } else {
+        throw new IllegalStateException();
+      }
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+      throw new UncheckedInterruptedException(e, "Interrupted in call to insertProperties(namespace, properties) " +
+          "Namespace: %s", namespace);
+    } catch (SQLException e) {
+      throw new UncheckedSQLException(e, "Failed to insertProperties to namespace: %s in catalog: %s", namespace,

Review comment:
       Nit: We try to keep error messages as human-readable strings as opposed to using function names as users likely won't know the names of functions anyways (and they can be logged in stack traces).
   
   So possibly this should be `Failed to insert properties to namespace ...`

##########
File path: core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java
##########
@@ -250,10 +265,95 @@ public void setConf(Configuration conf) {
     this.conf = conf;
   }
 
+  private boolean insertProperties(Namespace namespace, Map<String, String> properties) {
+    String namespaceName = JdbcUtil.namespaceToString(namespace);
+
+    try {
+      int insertedRecords = connections.run(conn -> {
+        String sqlStatement = JdbcUtil.insertPropertiesStatement(properties.size());
+
+        try (PreparedStatement sql = conn.prepareStatement(sqlStatement)) {
+          int rowIndex = 0;
+          for (Map.Entry<String, String> keyValue : properties.entrySet()) {
+            sql.setString(rowIndex + 1, catalogName);
+            sql.setString(rowIndex + 2, namespaceName);
+            sql.setString(rowIndex + 3, keyValue.getKey());
+            sql.setString(rowIndex + 4, keyValue.getValue());
+            rowIndex += 4;
+          }
+          return sql.executeUpdate();
+        }
+      });
+
+      if (insertedRecords == properties.size()) {
+        LOG.debug("Successfully inserted {} properties for namespace {}", properties, namespaceName);
+        return true;
+      } else {
+        throw new IllegalStateException();
+      }
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+      throw new UncheckedInterruptedException(e, "Interrupted in call to insertProperties(namespace, properties) " +
+          "Namespace: %s", namespace);
+    } catch (SQLException e) {
+      throw new UncheckedSQLException(e, "Failed to insertProperties to namespace: %s in catalog: %s", namespace,
+          catalogName);
+    }
+  }
+
+  private boolean updateProperties(Namespace namespace, Map<String, String> properties) {
+    String namespaceName = JdbcUtil.namespaceToString(namespace);
+
+    try {
+      int updatedRecords = connections.run(conn -> {
+        String sqlStatement = JdbcUtil.updatePropertiesStatement(properties.size());
+
+        try (PreparedStatement sql = conn.prepareStatement(sqlStatement)) {
+          int rowIndex = 0;
+          for (Map.Entry<String, String> keyValue : properties.entrySet()) {
+            sql.setString(rowIndex + 1, keyValue.getKey());
+            sql.setString(rowIndex + 2, keyValue.getValue());
+            rowIndex += 2;
+          }
+          sql.setString(rowIndex + 1, catalogName);
+          sql.setString(rowIndex + 2, namespaceName);
+          rowIndex += 2;
+          for (String key : properties.keySet()) {
+            sql.setString(rowIndex + 1, key);
+            rowIndex += 1;
+          }
+          LOG.info("Final log string {}", sql);
+          return sql.executeUpdate();
+        }
+      });
+
+      if (updatedRecords == properties.size()) {
+        LOG.debug("Successfully updated {} to new namespace: {}", properties, namespaceName);
+        return true;
+      } else {
+        throw new IllegalStateException();
+      }
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+      throw new UncheckedInterruptedException(e,
+          "Interrupted in call to updateProperties(namespace, properties) Namespace: %s", namespace);
+    } catch (SQLException e) {
+      throw new UncheckedSQLException(e, "Failed to updateProperties to namespace: %s in catalog: %s", namespace,
+              catalogName);
+    }
+  }
+
   @Override
   public void createNamespace(Namespace namespace, Map<String, String> metadata) {
-    throw new UnsupportedOperationException("Cannot create namespace " + namespace +
-        ": createNamespace is not supported");
+    if (namespaceExists(namespace)) {
+      throw new AlreadyExistsException("Namespace already exists: %s", namespace);
+    }
+
+    if (metadata == null || metadata.isEmpty()) {
+      throw new IllegalArgumentException("Cannot create a namespace with null or empty metadata");
+    }

Review comment:
       After looking at the reasoning for disallowing empty metadata, https://github.com/apache/iceberg/pull/3275/files#r741604136, should we add a short comment above to indicate why this can't be empty?
   
   Or, since we're using the metadata to indicate that a record exists, if users don't supply metadata, should we simply place some sort of value in there for them (possible choices I can think of might be a creation timestamp). This way users would be able to create namespaces without explicitly passing in metadata.
   
   Returning to this PR after a while so forgive me if there's been a reason covered that this is a bad idea already that I'm forgetting 😄 

##########
File path: core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java
##########
@@ -250,10 +265,93 @@ public void setConf(Configuration conf) {
     this.conf = conf;
   }
 
+  private boolean insertProperties(Namespace namespace, Map<String, String> properties) {
+    String namespaceName = JdbcUtil.namespaceToString(namespace);
+
+    try {
+      int insertedRecords = connections.run(conn -> {
+        String sqlStatement = JdbcUtil.insertPropertiesStatement(properties);
+
+        try (PreparedStatement sql = conn.prepareStatement(sqlStatement)) {
+          int rowIndex = 0;
+          for (Map.Entry<String, String> keyValue : properties.entrySet()) {
+            sql.setString(++rowIndex, catalogName);
+            sql.setString(++rowIndex, namespaceName);
+            sql.setString(++rowIndex, keyValue.getKey());
+            sql.setString(++rowIndex, keyValue.getValue());
+          }
+          return sql.executeUpdate();
+        }
+      });
+
+      if (insertedRecords == properties.size()) {
+        LOG.debug("Successfully committed to new namespace: {}", namespaceName);
+        return true;
+      } else {
+        throw new CommitFailedException("Failed to insertProperties to namespace %s in catalog %s", namespaceName,
+                catalogName);
+      }
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+      throw new UncheckedInterruptedException(e,
+              "Interrupted in call to insertProperties(namespace, properties) Namespace: %s", namespace);
+    } catch (SQLException e) {
+      throw new UncheckedSQLException(e, "Failed to insertProperties to namespace: %s in catalog: %s", namespace,
+              catalogName);
+    }
+  }
+
+  private boolean updateProperties(Namespace namespace, Map<String, String> properties) {

Review comment:
       I would personally say if there's no practical difference between the two, it would be ok to make them into one function.
   
   Possibly there could be one outer function, that contains all of the shared logic (like the interrupted exception and SQLException checks). And then two different functions that wrap that function with a code block for the building of the correct prepared statement. This might make the changes a bit easier to digest and be cleaner, but it's fine as it is otherwise.

##########
File path: core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java
##########
@@ -340,14 +440,119 @@ public boolean dropNamespace(Namespace namespace) throws NamespaceNotEmptyExcept
   @Override
   public boolean setProperties(Namespace namespace, Map<String, String> properties) throws
       NoSuchNamespaceException {
-    throw new UnsupportedOperationException(
-        "Cannot set properties " + namespace + " : setProperties is not supported");
+    if (!namespaceExists(namespace)) {
+      throw new NoSuchNamespaceException("Namespace does not exist: %s", namespace);
+    }
+
+    if (properties == null || properties.isEmpty()) {
+      throw new IllegalArgumentException("Cannot setProperties to a namespace with null or empty properties");
+    }

Review comment:
       Nit: This could probably be `Preconditions.checkArgument` or some variant of that.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org