You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/12/06 01:38:05 UTC

[GitHub] [iceberg] rdblue opened a new pull request #1879: Core: Support TableBuilder in CachingCatalog

rdblue opened a new pull request #1879:
URL: https://github.com/apache/iceberg/pull/1879


   The TableBuilder makes it easier to use catalogs from Spark, but the CachingCatalog doesn't currently support the builder so tests fail. This implements the builder for the CachingCatalog and uses it for the existing transaction methods. The existing tests should now exercise the builder.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1879: Core: Support TableBuilder in CachingCatalog

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1879:
URL: https://github.com/apache/iceberg/pull/1879#discussion_r537401615



##########
File path: core/src/main/java/org/apache/iceberg/CachingCatalog.java
##########
@@ -160,4 +159,90 @@ private void invalidate(TableIdentifier ident) {
 
     return builder.build();
   }
+
+  @Override
+  public TableBuilder buildTable(TableIdentifier identifier, Schema schema) {
+    return new CachingTableBuilder(identifier, schema);
+  }
+
+  private class CachingTableBuilder implements TableBuilder {
+    private final TableIdentifier ident;
+    private final TableBuilder innerBuilder;
+
+    private CachingTableBuilder(TableIdentifier identifier, Schema schema) {
+      this.innerBuilder = catalog.buildTable(identifier, schema);
+      this.ident = identifier;
+    }
+
+    @Override
+    public TableBuilder withPartitionSpec(PartitionSpec spec) {
+      innerBuilder.withPartitionSpec(spec);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withSortOrder(SortOrder sortOrder) {
+      innerBuilder.withSortOrder(sortOrder);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withLocation(String location) {
+      innerBuilder.withLocation(location);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withProperties(Map<String, String> properties) {
+      innerBuilder.withProperties(properties);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withProperty(String key, String value) {
+      innerBuilder.withProperty(key, value);
+      return this;
+    }
+
+    @Override
+    public Table create() {
+      AtomicBoolean created = new AtomicBoolean(false);
+      Table table = tableCache.get(canonicalizeIdentifier(ident), identifier -> {
+        created.set(true);
+        return innerBuilder.create();
+      });
+
+      if (!created.get()) {
+        throw new AlreadyExistsException("Table already exists: %s", ident);
+      }
+
+      return table;
+    }
+
+    @Override
+    public Transaction createTransaction() {
+      // create a new transaction without altering the cache. the table doesn't exist until the transaction is
+      // committed. if the table is created before the transaction commits, any cached version is correct and the
+      // transaction create will fail. if the transaction commits before another create, then the cache will be empty.
+      return innerBuilder.createTransaction();
+    }
+
+    @Override
+    public Transaction replaceTransaction() {
+      // create a new transaction without altering the cache. the table doesn't change until the transaction is
+      // committed. when the transaction commits, invalidate the table in the cache if it is present.
+      return CommitCallbackTransaction.addCallback(
+          innerBuilder.replaceTransaction(),
+          () -> tableCache.invalidate(canonicalizeIdentifier(ident)));

Review comment:
       We are calling `invalidate` on `tableCache` directly instead of the dedicated method in `CachingCatalog` that also invalidates metadata tables. I think we should keep the old logic.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1879: Core: Support TableBuilder in CachingCatalog

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1879:
URL: https://github.com/apache/iceberg/pull/1879#discussion_r537688110



##########
File path: core/src/main/java/org/apache/iceberg/CachingCatalog.java
##########
@@ -160,4 +159,90 @@ private void invalidate(TableIdentifier ident) {
 
     return builder.build();
   }
+
+  @Override
+  public TableBuilder buildTable(TableIdentifier identifier, Schema schema) {
+    return new CachingTableBuilder(identifier, schema);
+  }
+
+  private class CachingTableBuilder implements TableBuilder {
+    private final TableIdentifier ident;
+    private final TableBuilder innerBuilder;
+
+    private CachingTableBuilder(TableIdentifier identifier, Schema schema) {
+      this.innerBuilder = catalog.buildTable(identifier, schema);
+      this.ident = identifier;
+    }
+
+    @Override
+    public TableBuilder withPartitionSpec(PartitionSpec spec) {
+      innerBuilder.withPartitionSpec(spec);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withSortOrder(SortOrder sortOrder) {
+      innerBuilder.withSortOrder(sortOrder);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withLocation(String location) {
+      innerBuilder.withLocation(location);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withProperties(Map<String, String> properties) {
+      innerBuilder.withProperties(properties);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withProperty(String key, String value) {
+      innerBuilder.withProperty(key, value);
+      return this;
+    }
+
+    @Override
+    public Table create() {
+      AtomicBoolean created = new AtomicBoolean(false);
+      Table table = tableCache.get(canonicalizeIdentifier(ident), identifier -> {
+        created.set(true);
+        return innerBuilder.create();
+      });
+
+      if (!created.get()) {
+        throw new AlreadyExistsException("Table already exists: %s", ident);
+      }
+
+      return table;
+    }
+
+    @Override
+    public Transaction createTransaction() {
+      // create a new transaction without altering the cache. the table doesn't exist until the transaction is
+      // committed. if the table is created before the transaction commits, any cached version is correct and the
+      // transaction create will fail. if the transaction commits before another create, then the cache will be empty.
+      return innerBuilder.createTransaction();
+    }
+
+    @Override
+    public Transaction replaceTransaction() {
+      // create a new transaction without altering the cache. the table doesn't change until the transaction is
+      // committed. when the transaction commits, invalidate the table in the cache if it is present.
+      return CommitCallbackTransaction.addCallback(
+          innerBuilder.replaceTransaction(),
+          () -> tableCache.invalidate(canonicalizeIdentifier(ident)));

Review comment:
       Fixed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #1879: Core: Support TableBuilder in CachingCatalog

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #1879:
URL: https://github.com/apache/iceberg/pull/1879


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1879: Core: Support TableBuilder in CachingCatalog

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1879:
URL: https://github.com/apache/iceberg/pull/1879#discussion_r537687080



##########
File path: core/src/main/java/org/apache/iceberg/CachingCatalog.java
##########
@@ -160,4 +159,90 @@ private void invalidate(TableIdentifier ident) {
 
     return builder.build();
   }
+
+  @Override
+  public TableBuilder buildTable(TableIdentifier identifier, Schema schema) {
+    return new CachingTableBuilder(identifier, schema);
+  }
+
+  private class CachingTableBuilder implements TableBuilder {
+    private final TableIdentifier ident;
+    private final TableBuilder innerBuilder;
+
+    private CachingTableBuilder(TableIdentifier identifier, Schema schema) {
+      this.innerBuilder = catalog.buildTable(identifier, schema);
+      this.ident = identifier;
+    }
+
+    @Override
+    public TableBuilder withPartitionSpec(PartitionSpec spec) {
+      innerBuilder.withPartitionSpec(spec);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withSortOrder(SortOrder sortOrder) {
+      innerBuilder.withSortOrder(sortOrder);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withLocation(String location) {
+      innerBuilder.withLocation(location);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withProperties(Map<String, String> properties) {
+      innerBuilder.withProperties(properties);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withProperty(String key, String value) {
+      innerBuilder.withProperty(key, value);
+      return this;
+    }
+
+    @Override
+    public Table create() {
+      AtomicBoolean created = new AtomicBoolean(false);
+      Table table = tableCache.get(canonicalizeIdentifier(ident), identifier -> {
+        created.set(true);
+        return innerBuilder.create();
+      });
+
+      if (!created.get()) {
+        throw new AlreadyExistsException("Table already exists: %s", ident);
+      }
+
+      return table;
+    }
+
+    @Override
+    public Transaction createTransaction() {
+      // create a new transaction without altering the cache. the table doesn't exist until the transaction is
+      // committed. if the table is created before the transaction commits, any cached version is correct and the
+      // transaction create will fail. if the transaction commits before another create, then the cache will be empty.
+      return innerBuilder.createTransaction();
+    }
+
+    @Override
+    public Transaction replaceTransaction() {
+      // create a new transaction without altering the cache. the table doesn't change until the transaction is
+      // committed. when the transaction commits, invalidate the table in the cache if it is present.
+      return CommitCallbackTransaction.addCallback(
+          innerBuilder.replaceTransaction(),
+          () -> tableCache.invalidate(canonicalizeIdentifier(ident)));

Review comment:
       Good catch. I hit this on Thursday in our branch based on 0.9.0 and backported it here, which is why I missed it. Thanks!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #1879: Core: Support TableBuilder in CachingCatalog

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #1879:
URL: https://github.com/apache/iceberg/pull/1879#issuecomment-739834121


   This PR looks good to me but I think we should keep calling the `invalidate` method in `CachingCatalog`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1879: Core: Support TableBuilder in CachingCatalog

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1879:
URL: https://github.com/apache/iceberg/pull/1879#discussion_r537401615



##########
File path: core/src/main/java/org/apache/iceberg/CachingCatalog.java
##########
@@ -160,4 +159,90 @@ private void invalidate(TableIdentifier ident) {
 
     return builder.build();
   }
+
+  @Override
+  public TableBuilder buildTable(TableIdentifier identifier, Schema schema) {
+    return new CachingTableBuilder(identifier, schema);
+  }
+
+  private class CachingTableBuilder implements TableBuilder {
+    private final TableIdentifier ident;
+    private final TableBuilder innerBuilder;
+
+    private CachingTableBuilder(TableIdentifier identifier, Schema schema) {
+      this.innerBuilder = catalog.buildTable(identifier, schema);
+      this.ident = identifier;
+    }
+
+    @Override
+    public TableBuilder withPartitionSpec(PartitionSpec spec) {
+      innerBuilder.withPartitionSpec(spec);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withSortOrder(SortOrder sortOrder) {
+      innerBuilder.withSortOrder(sortOrder);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withLocation(String location) {
+      innerBuilder.withLocation(location);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withProperties(Map<String, String> properties) {
+      innerBuilder.withProperties(properties);
+      return this;
+    }
+
+    @Override
+    public TableBuilder withProperty(String key, String value) {
+      innerBuilder.withProperty(key, value);
+      return this;
+    }
+
+    @Override
+    public Table create() {
+      AtomicBoolean created = new AtomicBoolean(false);
+      Table table = tableCache.get(canonicalizeIdentifier(ident), identifier -> {
+        created.set(true);
+        return innerBuilder.create();
+      });
+
+      if (!created.get()) {
+        throw new AlreadyExistsException("Table already exists: %s", ident);
+      }
+
+      return table;
+    }
+
+    @Override
+    public Transaction createTransaction() {
+      // create a new transaction without altering the cache. the table doesn't exist until the transaction is
+      // committed. if the table is created before the transaction commits, any cached version is correct and the
+      // transaction create will fail. if the transaction commits before another create, then the cache will be empty.
+      return innerBuilder.createTransaction();
+    }
+
+    @Override
+    public Transaction replaceTransaction() {
+      // create a new transaction without altering the cache. the table doesn't change until the transaction is
+      // committed. when the transaction commits, invalidate the table in the cache if it is present.
+      return CommitCallbackTransaction.addCallback(
+          innerBuilder.replaceTransaction(),
+          () -> tableCache.invalidate(canonicalizeIdentifier(ident)));

Review comment:
       Hm, is this correct? We are calling `invalidate` on `tableCache` directly instead of the dedicated method in `CachingCatalog` that also invalidates metadata tables.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1879: Core: Support TableBuilder in CachingCatalog

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1879:
URL: https://github.com/apache/iceberg/pull/1879#discussion_r536930541



##########
File path: core/src/main/java/org/apache/iceberg/CachingCatalog.java
##########
@@ -99,36 +99,35 @@ public Table loadTable(TableIdentifier ident) {
   @Override
   public Table createTable(TableIdentifier ident, Schema schema, PartitionSpec spec, String location,
                            Map<String, String> properties) {
-    AtomicBoolean created = new AtomicBoolean(false);
-    Table table = tableCache.get(canonicalizeIdentifier(ident), identifier -> {
-      created.set(true);
-      return catalog.createTable(identifier, schema, spec, location, properties);
-    });
-
-    if (!created.get()) {
-      throw new AlreadyExistsException("Table already exists: %s", ident);
-    }
-
-    return table;
+    return buildTable(ident, schema)
+        .withPartitionSpec(spec)
+        .withLocation(location)
+        .withProperties(properties)
+        .create();

Review comment:
       We should consider moving these to be default implementations in the interface.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #1879: Core: Support TableBuilder in CachingCatalog

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #1879:
URL: https://github.com/apache/iceberg/pull/1879#issuecomment-740099898


   Merged this. Thanks for reviewing, @rymurr and @aokolnychyi!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org