Posted to dev@iceberg.apache.org by Chen Song <ch...@gmail.com> on 2020/06/26 17:39:00 UTC

Iceberg table compaction

Hey

The Iceberg documentation says to use this for compaction
<https://iceberg.apache.org/spec/#snapshots>. I have a few questions about
compaction.

Does this (replace) refer to the RewriteFiles API
<https://iceberg.apache.org/javadoc/master/org/apache/iceberg/RewriteFiles.html>
?
If so, it looks like it only applies to the most recent snapshot of the data?
Is there a way to compact data belonging to old snapshots, e.g., if I want
to rewrite older data with a newer partition spec?

Thanks for the help in advance.

-- 
Chen Song

Re: Iceberg table compaction

Posted by Chen Song <ch...@gmail.com>.

Thanks Ryan. This makes sense.

To reiterate what you said:
* Compaction can be done with the RewriteFiles API; it creates a new table
snapshot with a new set of files, leaving the data content unchanged.
* Garbage collection can be done with the ExpireSnapshots API; data files
are deleted during the API call once their reference count is 0.
* Neither process is automatic; both need to be executed explicitly,
e.g., there is no automatic expiration of snapshots based on time.
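
As a rough illustration of the last two points, here is a toy Python model of expiration. It is not the Iceberg API: `expire_snapshots`, the snapshot ids, and the file names are all hypothetical. It assumes a snapshot is just a set of file names, that expiration is an explicit call, and that a file becomes deletable only once no remaining snapshot references it.

```python
from typing import Dict, FrozenSet, Set, Tuple


def expire_snapshots(
    snapshots: Dict[int, FrozenSet[str]],
    keep_ids: Set[int],
) -> Tuple[Dict[int, FrozenSet[str]], Set[str]]:
    """Drop every snapshot not in keep_ids.

    Returns the remaining snapshots and the files that no remaining
    snapshot references (i.e. reference count 0), which are safe to delete.
    """
    remaining = {sid: files for sid, files in snapshots.items() if sid in keep_ids}
    all_files: Set[str] = set().union(*snapshots.values()) if snapshots else set()
    still_referenced: Set[str] = set().union(*remaining.values()) if remaining else set()
    return remaining, all_files - still_referenced


# Hypothetical table history: snapshot 2 is the result of a compaction.
snapshots = {
    1: frozenset({"small-a.parquet", "small-b.parquet"}),
    2: frozenset({"compacted-ab.parquet"}),
}

# Nothing is deleted until expiration is called explicitly.
remaining, deletable = expire_snapshots(snapshots, keep_ids={2})
print(sorted(deletable))
```

In this model, keeping both snapshots leaves nothing deletable, which mirrors the point that the small files stay around until the old snapshot is expired.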

Thanks,
Chen

On Fri, Jun 26, 2020 at 3:13 PM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Hi Chen,
>
> The "replace" operation indicates that although the files in a table
> changed, the actual table data did not. Queries should produce the same
> results, if they are deterministic. That's why we use it for file
> compaction: although we replace small files with fewer, larger files, the
> overall content doesn't change. To your question, yes, that's what the
> RewriteFiles API does.
>
> All operations change the current table state. Each snapshot of a table is
> a complete set of the data files that make up the table, and snapshots are
> immutable. So you can't go back and change a snapshot from yesterday. What
> you can do is replace small files in the current state with a compacted
> large file. That creates a new snapshot that is used from then on. The
> small files are still referenced and available as long as the old snapshot
> exists, which is why snapshots should be cleaned up regularly with
> ExpireSnapshots. That will delete files that are no longer referenced.
>
> We keep files referenced by old snapshots around for a couple of reasons. First,
> readers that started with a different current snapshot may still be reading
> them. Second, it allows you to go back and read the table at an older point
> in time -- time-travel queries.
>
> I hope that helps,
>
> rb
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


-- 
Chen Song

Re: Iceberg table compaction

Posted by Ryan Blue <rb...@netflix.com.INVALID>.

Hi Chen,

The "replace" operation indicates that although the files in a table
changed, the actual table data did not. Queries should produce the same
results, if they are deterministic. That's why we use it for file
compaction: although we replace small files with fewer, larger files, the
overall content doesn't change. To your question, yes, that's what the
RewriteFiles API does.

All operations change the current table state. Each snapshot of a table is
a complete set of the data files that make up the table, and snapshots are
immutable. So you can't go back and change a snapshot from yesterday. What
you can do is replace small files in the current state with a compacted
large file. That creates a new snapshot that is used from then on. The
small files are still referenced and available as long as the old snapshot
exists, which is why snapshots should be cleaned up regularly with
ExpireSnapshots. That will delete files that are no longer referenced.
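
The relationship between immutable snapshots and a "replace" commit can be sketched with a small toy model. This is not the Iceberg API: the `Snapshot` class, the `rewrite_files` function, and the file names are hypothetical, and the model assumes a table is simply an ordered list of snapshots, each holding a complete set of file names.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Snapshot:
    """An immutable, complete listing of the files that make up the table."""
    snapshot_id: int
    operation: str
    files: FrozenSet[str]


def rewrite_files(history: List[Snapshot], to_delete: Set[str], to_add: Set[str]) -> List[Snapshot]:
    """Commit a 'replace': swap files in the current state, leaving old snapshots untouched."""
    current = history[-1]
    if not to_delete <= current.files:
        # Only files in the current table state can be replaced.
        raise ValueError("can only replace files in the current snapshot")
    new_files = (current.files - to_delete) | to_add
    new_snapshot = Snapshot(current.snapshot_id + 1, "replace", frozenset(new_files))
    return history + [new_snapshot]


history = [Snapshot(1, "append", frozenset({"small-a", "small-b", "big-c"}))]
history = rewrite_files(history, {"small-a", "small-b"}, {"compacted-ab"})

print(sorted(history[0].files))   # old snapshot still lists the small files
print(sorted(history[-1].files))  # new current state lists the compacted file
```

The old snapshot is never modified; the compaction only appends a new one, which is why the small files remain referenced until that old snapshot is expired.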

We keep files referenced by old snapshots around for a couple of reasons. First,
readers that started with a different current snapshot may still be reading
them. Second, it allows you to go back and read the table at an older point
in time -- time-travel queries.
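
To make the time-travel idea concrete, here is a minimal sketch of an "as of" lookup over a snapshot log. It is a toy model rather than Iceberg's implementation: `snapshot_as_of`, the integer timestamps, and the file names are hypothetical, and the log is assumed to be sorted by commit time.

```python
from bisect import bisect_right
from typing import FrozenSet, List, Tuple


def snapshot_as_of(log: List[Tuple[int, FrozenSet[str]]], ts: int) -> FrozenSet[str]:
    """Return the files of the latest snapshot committed at or before ts.

    `log` is a list of (commit_timestamp, files) pairs sorted by timestamp.
    """
    timestamps = [commit_ts for commit_ts, _ in log]
    idx = bisect_right(timestamps, ts)
    if idx == 0:
        raise LookupError("no snapshot at or before that time")
    return log[idx - 1][1]


log = [
    (100, frozenset({"small-a", "small-b"})),  # before compaction
    (200, frozenset({"compacted-ab"})),        # after compaction
]

# A reader pinned to time 150 still sees the small files, so they must
# stay on disk while that snapshot is retained.
print(sorted(snapshot_as_of(log, 150)))
print(sorted(snapshot_as_of(log, 200)))
```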

I hope that helps,

rb

-- 
Ryan Blue
Software Engineer
Netflix