Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/14 15:42:34 UTC

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6179: AWS: Re-tag files when renaming tables in GlueCatalog

jackye1995 commented on code in PR #6179:
URL: https://github.com/apache/iceberg/pull/6179#discussion_r1021702623


##########
aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java:
##########
@@ -624,4 +642,153 @@ public void setConf(Configuration conf) {
   protected Map<String, String> properties() {
     return catalogProperties == null ? ImmutableMap.of() : catalogProperties;
   }
+
+  private void updateTableTag(TableIdentifier from, TableIdentifier to) {
+    // should update tag when the rename process is successful
+    TableOperations ops = newTableOps(to);
+    TableMetadata lastMetadata = null;
+    try {
+      lastMetadata = ops.current();
+    } catch (NotFoundException e) {
+      LOG.warn(
+          "Failed to load table metadata for table: {}, continuing rename without re-tag", to, e);
+    }
+    Set<Tag> oldTags = Sets.newHashSet();
+    Set<Tag> newTags = Sets.newHashSet();
+    boolean skipNameValidation = awsProperties.glueCatalogSkipNameValidation();
+    if (awsProperties.s3WriteTableTagEnabled()) {
+      oldTags.add(
+          Tag.builder()
+              .key(AwsProperties.S3_TAG_ICEBERG_TABLE)
+              .value(IcebergToGlueConverter.getTableName(from, skipNameValidation))
+              .build());
+      newTags.add(
+          Tag.builder()
+              .key(AwsProperties.S3_TAG_ICEBERG_TABLE)
+              .value(IcebergToGlueConverter.getTableName(to, skipNameValidation))
+              .build());
+    }
+
+    if (awsProperties.s3WriteNamespaceTagEnabled()) {
+      oldTags.add(
+          Tag.builder()
+              .key(AwsProperties.S3_TAG_ICEBERG_NAMESPACE)
+              .value(IcebergToGlueConverter.getDatabaseName(from, skipNameValidation))
+              .build());
+      newTags.add(
+          Tag.builder()
+              .key(AwsProperties.S3_TAG_ICEBERG_NAMESPACE)
+              .value(IcebergToGlueConverter.getDatabaseName(to, skipNameValidation))
+              .build());
+    }
+
+    if (lastMetadata != null && ops.io() instanceof S3FileIO) {
+      updateTableTag((S3FileIO) ops.io(), lastMetadata, oldTags, newTags);
+    }
+  }
+
+  private void updateTableTag(
+      S3FileIO io, TableMetadata metadata, Set<Tag> oldTags, Set<Tag> newTags) {
+    Set<String> manifestListsToUpdate = Sets.newHashSet();
+    Set<ManifestFile> manifestsToUpdate = Sets.newHashSet();
+    for (Snapshot snapshot : metadata.snapshots()) {
+      // add all manifests to the delete set because both data and delete files should be removed
+      Iterables.addAll(manifestsToUpdate, snapshot.allManifests(io));
+      // add the manifest list to the delete set, if present
+      if (snapshot.manifestListLocation() != null) {
+        manifestListsToUpdate.add(snapshot.manifestListLocation());
+      }
+    }
+
+    LOG.info("Manifests to update: {}", Joiner.on(", ").join(manifestsToUpdate));
+
+    boolean gcEnabled =

Review Comment:
   I guess this logic was taken from table file deletion, but it does not seem to apply here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org