Posted to commits@iceberg.apache.org by bl...@apache.org on 2018/12/12 21:50:07 UTC

[incubator-iceberg] branch master updated: Do not scan manifests with no deletes when expiring. (#46)

This is an automated email from the ASF dual-hosted git repository.

blue pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-iceberg.git


The following commit(s) were added to refs/heads/master by this push:
     new 26a3e4a  Do not scan manifests with no deletes when expiring. (#46)
26a3e4a is described below

commit 26a3e4afc05d09556748342888278439a30b455b
Author: Ryan Blue <rd...@users.noreply.github.com>
AuthorDate: Wed Dec 12 13:50:01 2018 -0800

    Do not scan manifests with no deletes when expiring. (#46)
---
 core/src/main/java/com/netflix/iceberg/RemoveSnapshots.java | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
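
Note on the check added by this commit: a manifest is skipped only when its deleted-file count is both recorded and equal to zero. A null count is treated as unknown, so that manifest is still scanned. The standalone sketch below illustrates that null-safe test. ManifestSkipSketch, ManifestInfo, canSkip, and manifestsToScan are hypothetical names invented for this illustration and are not part of the Iceberg API; only deletedFilesCount mirrors the accessor used in the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ManifestSkipSketch {
  // Hypothetical stand-in for manifest metadata; not the Iceberg manifest type.
  static final class ManifestInfo {
    final String path;
    final Integer deletedFilesCount; // null means the count was never recorded

    ManifestInfo(String path, Integer deletedFilesCount) {
      this.path = path;
      this.deletedFilesCount = deletedFilesCount;
    }
  }

  // A manifest can be skipped only if the count is known AND zero.
  // An unknown (null) count must fall through to a full scan.
  static boolean canSkip(ManifestInfo manifest) {
    Integer deleted = manifest.deletedFilesCount;
    return deleted != null && deleted == 0;
  }

  // Returns the paths of manifests that still need to be read.
  static List<String> manifestsToScan(List<ManifestInfo> manifests) {
    List<String> toScan = new ArrayList<>();
    for (ManifestInfo manifest : manifests) {
      if (canSkip(manifest)) {
        continue; // no DELETE entries, nothing to find here
      }
      toScan.add(manifest.path);
    }
    return toScan;
  }

  public static void main(String[] args) {
    List<ManifestInfo> manifests = Arrays.asList(
        new ManifestInfo("a.avro", 0),     // known: no deletes, skipped
        new ManifestInfo("b.avro", 2),     // known: has deletes, scanned
        new ManifestInfo("c.avro", null)); // unknown, scanned to be safe
    System.out.println(manifestsToScan(manifests)); // prints [b.avro, c.avro]
  }
}
```

Treating null as "must scan" keeps the shortcut conservative: skipping is an optimization, while missing a delete would leave orphaned data files.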

diff --git a/core/src/main/java/com/netflix/iceberg/RemoveSnapshots.java b/core/src/main/java/com/netflix/iceberg/RemoveSnapshots.java
index 541cc5f..4784dd1 100644
--- a/core/src/main/java/com/netflix/iceberg/RemoveSnapshots.java
+++ b/core/src/main/java/com/netflix/iceberg/RemoveSnapshots.java
@@ -162,8 +162,11 @@ class RemoveSnapshots implements ExpireSnapshots {
         .onFailure((item, exc) ->
             LOG.warn("Failed to get deleted files: this may cause orphaned data files", exc)
         ).run(manifest -> {
-          // even if the manifest is still used, it may contain files that can be deleted
-          // TODO: eliminate manifests with no deletes without scanning
+          if (manifest.deletedFilesCount() != null && manifest.deletedFilesCount() == 0) {
+            return;
+          }
+
+          // the manifest has deletes, scan it to find files to delete
           try (ManifestReader reader = ManifestReader.read(ops.io().newInputFile(manifest.path()))) {
             for (ManifestEntry entry : reader.entries()) {
               // if the snapshot ID of the DELETE entry is no longer valid, the data can be deleted