Posted to dev@iceberg.apache.org by Erik Wright <er...@shopify.com> on 2019/08/07 14:48:02 UTC

Re: Row-level delete sync notes - July 2019

Hi Folks,

I've been on holiday (and will be again next week) but I've started taking
some steps internally to dedicate some engineering time to this project.
Around the last week of August I expect to be able to dedicate some
meaningful time each week to this.

On Thu, Jul 18, 2019 at 3:47 PM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Hi everyone, sorry it took a while for me to get these notes sent out.
> Please reply with discussion or corrections.
>
> *Attendees*:
>
> Ryan Blue
> Anjali Norwood
> Jacques Nadeau
> Anton Okolnychyi
> David Muto
> Erik Wright
> Owen O’Malley
>
> *Topics*:
>
>    - Quick summary of the current approach with sequence numbers
>    - Should global delete be supported?
>    - What is the scope of deletes within a snapshot?
>    - Should synthetic delete files use sequence numbers?
>    - What should be used as a record identifier? Does offset work?
>    - What is the format of a delete diff?
>    - What is the scope of a delete diff in a table?
>    - How will per-file delete diffs work?
>    - Next steps
>
> *Discussion*:
>
>    - Quick summary: we agree that Iceberg will add sequence numbers to
>    metadata to scope deletes across snapshots (time). Deletes will apply to
>    all files with an earlier sequence number. Iceberg will use two
>    formats, a synthetic delete diff using file/offset and an equality delete
>    diff using a set of values to match to row data.
>    - Global deletes
>       - Ryan: this is for GDPR. Data layout is for query performance, not
>       delete performance. Deleting records that could be anywhere should be
>       possible without eagerly scanning all data files in tables that are tens of
>       petabytes
>       - Owen: customers have this use case as well, it should be supported
>       - Erik (I think): these are slow to apply because they probably are
>       not sorted
>       - Jacques: hash-set deletes are not a format constraint, it is an
>       engine constraint
>       - Ryan: we can’t always depend on sorting. That is an optimization,
>       but engines may need to use a hash-set
>       - Owen (I think): table maintenance and delete compaction are
>       important to keep merge costs low
>    - Scope of deletes within a snapshot:
>       - Erik suggested using all the same metadata as data files
>       - Partition data will be used to scope deletes within a snapshot.
>    - Sequence numbers and synthetic delete files:
>       - Anton: will sequence numbers be used?
>       - Ryan: Yes. More ways to eliminate delete diffs that do not need
>       to be applied is good for performance. Simpler to always apply the same
>       rules, too.
>       - Also, reusing files (or un-deleting files) could be a correctness
>       problem.
>    - Synthetic deletes and offsets: What should be used as a record
>    identifier?
>       - The confusion was that “offset” could be interpreted as byte
>       offset in a file. The intent was row position. Will use “position” from now
>       on.
>    - What is the format of a delete diff?
>       - Equality deletes: the data columns to match. For example, a, b
>       for a = ? and b = ? with ? filled in by row data
>       - Positional deletes: file name and row position (sparse format,
>       multiple data files covered by a single delete file)
>       - Jacques (I think): How would a positional delete file apply to
>       just one data file?
>       - Erik: Delete files should also have column lower/upper bounds for
>       the deleted rows. This can be copied and merged for data files to also use
>       stats to eliminate deletes that do not need to be applied
>       - Erik: The latest write-up uses all existing data file metadata
>       fields, unchanged
>       - Ryan: that’s a clever idea and would help performance
>       - It wouldn’t be possible to add a filename field to lower/upper
>       bounds, so this wouldn’t work for scoping to a single file
>       - Should a single file name, list of file names, or bloom filter
>       be added to identify data files? Maybe as a future optimization
>    - How will scope work for global deletes?
>       - Ryan: use a manifest with a different partition spec, namely the
>       unpartitioned spec for global delete files.
>       - Erik: that could be used to apply deletes to other levels as
>       well. For example, a table partitioned with bucketing could encode deletes
>       with a partition spec that doesn’t include the bucketing level to apply to
>       all buckets.
>       - Ryan: that is difficult because Iceberg would need to decide that
>       one partition contains another to apply diffs across partitions. This
>       complicates scope, so maybe we should only allow global and partition-level
>       for now.
>    - Next steps:
>       - Add preconditions for table format version (done!)
>       - Add sequence numbers to metadata and update the spec (compatible
>       with v1)
>       - Define and document the delete diff formats (Anton and Erik)
>       - Update readers to apply deletes
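
To make the scoping rule in the notes concrete, here is a minimal sketch of
how a reader might decide whether a delete file applies to a data file. It
assumes the rule as stated above (a delete applies only to files with an
earlier sequence number) and the "global or partition-level" scope. The
DataFile and DeleteFile classes, their fields, and delete_applies are
hypothetical names for illustration, not Iceberg's API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


# Hypothetical, simplified metadata records; not Iceberg's actual classes.
@dataclass(frozen=True)
class DataFile:
    path: str
    sequence_number: int
    partition: Tuple  # partition tuple, e.g. ("2019-07-18",)


@dataclass(frozen=True)
class DeleteFile:
    path: str
    sequence_number: int
    # None models a global delete (written with the unpartitioned spec).
    partition: Optional[Tuple]


def delete_applies(delete: DeleteFile, data: DataFile) -> bool:
    """Return True if the delete file may affect rows in the data file."""
    # Scope in time: only data files with an earlier sequence number.
    if data.sequence_number >= delete.sequence_number:
        return False
    # Scope within a snapshot: global deletes apply everywhere,
    # partition-level deletes only to files in the same partition.
    return delete.partition is None or delete.partition == data.partition
```

A data file written after the delete is never touched, which is also why
reusing or un-deleting files could be a correctness problem: the file's
sequence number decides which deletes it sees.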
>
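
As a rough illustration of the two delete diff formats discussed in the
notes, the sketch below filters the rows of one data file using positional
deletes (file name and row position) and equality deletes held in a hash
set, which is the fallback the notes mention for when sorting cannot be
relied on. The function name, parameters, and in-memory row representation
are assumptions made for the example, not a proposed reader implementation.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Row = Dict[str, object]


def apply_deletes(
    file_path: str,
    rows: Iterable[Row],
    positional: Set[Tuple[str, int]],
    equality_columns: List[str],
    equality_values: Set[Tuple],
) -> Iterator[Row]:
    """Yield the rows of one data file that survive both kinds of delete.

    positional holds (file name, row position) pairs and may cover many
    data files. equality_values holds tuples of values for
    equality_columns, e.g. columns a, b for "a = ? and b = ?".
    """
    for position, row in enumerate(rows):
        # Positional delete: match on file name plus row position.
        if (file_path, position) in positional:
            continue
        # Equality delete: hash-set lookup on the matched column values.
        key = tuple(row[column] for column in equality_columns)
        if key in equality_values:
            continue
        yield row
```

For example, with rows (1, "x"), (2, "y"), (3, "z") in file "f1", a
positional delete of ("f1", 0) and an equality delete of (3, "z") on columns
a, b leave only the row (2, "y").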
> --
> Ryan Blue
> Software Engineer
> Netflix
>
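
The suggestion in the notes that delete files carry column lower/upper
bounds can be sketched as a simple range-overlap test: if, for any column
with stats on both sides, the deleted rows' range and the data file's range
are disjoint, the delete does not need to be applied to that file. The
Bounds type and may_overlap function below are hypothetical and assume
bounds are stored per column as comparable (lower, upper) pairs.

```python
from typing import Dict, Tuple

# column name -> (lower bound, upper bound), both inclusive
Bounds = Dict[str, Tuple[object, object]]


def may_overlap(delete_bounds: Bounds, data_bounds: Bounds) -> bool:
    """Return False only when stats prove the delete cannot match the file."""
    for column, (delete_lo, delete_hi) in delete_bounds.items():
        if column not in data_bounds:
            # No stats for this column: nothing can be ruled out.
            continue
        data_lo, data_hi = data_bounds[column]
        # Disjoint ranges on any column mean no deleted row is in the file.
        if delete_hi < data_lo or data_hi < delete_lo:
            return False
    return True
```

This check is conservative: a True result only means the delete might
apply, so the reader still has to match rows.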

Re: Row-level delete sync notes - July 2019

Posted by Ryan Blue <rb...@netflix.com>.
Thanks for the update, Erik!

In addition, Anton has opened PR #351
<https://github.com/apache/incubator-iceberg/pull/351> to update the API so
that we can implement eager row-level overwrites. I think that's the only
part that needs to be done for the eager overwrite case because the rest of
the overwrite would be handled by query engines.

I've also opened a few issues and a Row-level Delete Milestone
<https://github.com/apache/incubator-iceberg/milestone/4> to track the
features. Feel free to open more issues or pull requests for what you're
working on and link them to that milestone.

-- 
Ryan Blue
Software Engineer
Netflix