You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@iceberg.apache.org by Ryan Blue <rb...@netflix.com.INVALID> on 2018/12/04 00:44:14 UTC

Adding a manifest list file to Snapshots

I’ve been working on a format update lately to address some problems we’ve
been hitting with our large tables:

   - Metadata file size is very large, causing commits to take a long time
   - Tables with millions of files take more than a minute to plan scan
   tasks because so many files are in the table’s manifests
   - Other operations like expiring snapshots that require processing a
   filtered list of files take too long

The first problem is caused by a large number of manifests in a large
number of snapshots written into every metadata file. As the rate of
commits to a table goes up, so does the number of metadata files. Even with
data compaction, each commit is going to copy most of the manifest list and
write it along with all the other valid snapshots. When writing every 2
minutes, this ends up being about 2,000 snapshots over 3 days, each keeping
track of all the current manifests. Moving the manifest list out of the
metadata file is a good way to keep this metadata small with a large number
of valid snapshots.

The other two problems are caused by not having enough metadata about each
manifest to skip or process it, so we end up processing every manifest. For
a table scan, knowing the range of values for each partition field helps
eliminate whole manifests. For snapshot expiration, knowing the number of
deleted files in a snapshot allows skipping manifests with no deletes.

To solve both of these problems, I’ve implemented a spec change that adds a
“manifest-list” in place of the “manifests” list in table metadata. The
manifest list is the location of an Avro file with an entry for every
manifest and all of the metadata needed to compact manifests, filter for
certain operations, and filter for table scans. The details are on PR #21
<https://github.com/apache/incubator-iceberg/pull/21>.

This also depends on tracking partition specs by ID, so that the spec used
to write each manifest is tracked in the manifest list by its ID. That
change is PR #3 <https://github.com/apache/incubator-iceberg/pull/3>.

Since both of these are spec changes, I wanted to highlight them for the
community. Feel free to comment on the PRs or discuss in this thread. I
think the changes are straightforward, and are good to get done before the
first Apache release.

Thanks,

rb
-- 
Ryan Blue
Software Engineer
Netflix