Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/10/01 01:02:03 UTC

[GitHub] [iceberg] rdblue edited a comment on pull request #1499: Update the Iceberg spec for row-level deletes

rdblue edited a comment on pull request #1499:
URL: https://github.com/apache/iceberg/pull/1499#issuecomment-701725686


   @electrum, it sounds like there are two parts to your concerns that I'll address. First, what is the use of restricting the scope of equality deletes? And second, why not use predicates for global deletes instead of equality files?
   
   > What doesn't make sense is recording the file name. . . . I actually can't think of any reason why we'd want to restrict it to a specific file.
   
   Equality deletes aren't restricted to a specific file. The position deletes remove a particular row in a data file, but equality deletes are like predicates applied to entire partitions.
   
   There's good reason to be able to scope an equality delete to a partition. If we accumulated all of the equality deletes globally, then every data file could potentially have a huge number of deletes to apply. For example, consider a CDC stream that is being written to a table that changes rows through `UPSERT` operations by some row ID. If the table is also bucketed by that row ID, then we can easily reduce the number of deletes that need to be applied to the average data file by a couple of orders of magnitude by restricting the scope of a delete to just one partition. This allows us to accumulate and efficiently handle more changes as deletes before it is necessary to compact.
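
As a rough illustration of the bucketing argument, here is a minimal sketch that counts how many equality deletes a single data file would have to consider under global versus partition scope. The bucket count, the number of deleted IDs, and the plain-modulus `bucket` function are all assumptions made for the example; the modulus is only a stand-in for a real bucket transform.

```python
from collections import defaultdict

NUM_BUCKETS = 128  # hypothetical bucket count for the row ID column


def bucket(row_id: int) -> int:
    # Stand-in for a real bucket transform; a plain modulus keeps the sketch simple.
    return row_id % NUM_BUCKETS


def deletes_per_partition(deleted_ids):
    """Group deleted row IDs by the bucket (partition) they fall into."""
    by_bucket = defaultdict(set)
    for row_id in deleted_ids:
        by_bucket[bucket(row_id)].add(row_id)
    return by_bucket


deleted_ids = range(100_000)  # hypothetical backlog of upserted row IDs
scoped = deletes_per_partition(deleted_ids)

# Global scope: every data file has to consider every delete.
global_deletes_per_file = len(set(deleted_ids))
# Partition scope: a data file only considers deletes from its own bucket.
scoped_deletes_per_file = max(len(ids) for ids in scoped.values())
```

With these made-up numbers, a data file goes from 100,000 candidate deletes to at most 782, which is roughly the two orders of magnitude mentioned above.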
   
   In addition, because deletes are scoped by sequence number, we won't always be able to compact equality delete files together. If all of the deletes were at the table level, then there would necessarily be a lot more delete files to open per data file, even in a maintained table. That puts pressure on compacting deletes into data files, even if there are very few deletes. Again, scoping by partition helps us avoid this.
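
To see why sequence numbers can block compaction of delete files, consider the sketch below. It assumes an equality delete applies only to data files with a strictly smaller sequence number; the `EqualityDelete` class and `is_deleted` helper are hypothetical names for illustration, not Iceberg APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EqualityDelete:
    seq: int          # sequence number of the commit that added the delete
    ids: frozenset    # row IDs removed by this delete file


def is_deleted(row_id, data_seq, deletes):
    # Assumption: a delete only applies to data written before it.
    return any(data_seq < d.seq and row_id in d.ids for d in deletes)


d1 = EqualityDelete(seq=5, ids=frozenset({1}))
d2 = EqualityDelete(seq=9, ids=frozenset({2}))

# Row id=1 is written again at sequence number 7, for example by an upsert.
kept_with_separate_files = not is_deleted(1, 7, [d1, d2])

# Naively merging both delete files under the newer sequence number
# would also remove the row that was re-inserted at sequence number 7.
merged = EqualityDelete(seq=9, ids=d1.ids | d2.ids)
dropped_after_naive_merge = is_deleted(1, 7, [merged])
```

The re-inserted row survives when the delete files stay separate but is lost after the naive merge, so the two files cannot simply be combined.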
   
   > Basic range filters would cover common use cases, such as "delete all data older than X".
   
   I think this suggestion was mainly motivated by the idea that all equality deletes could be globally scoped, but I think it's a good suggestion for global deletes. It would be really nice to be able to encode deletes like this for certain use cases. And global equality deletes are already a special case. Maybe we should make a way to encode expressions.
   
   My main concerns with this are around overhead. If global deletes are equality deletes, then we can do some amount of work to determine which partitions match and re-encode the delete against those partitions. For example, in the `id = 5` case, even if the table is partitioned by time instead of id, I can go find the hours where the id was active based on lower/upper bounds in data files. That would work for lots of tables, where the record ids are correlated with time. But we wouldn't be able to rewrite to more targeted deletes if any other expression were used.
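
The `id = 5` rewrite could look roughly like the following sketch. The `DataFile` record, its field names, and the hour-string partitions are hypothetical; the only idea taken from the discussion is that per-file lower/upper bounds on `id` can be used to find the partitions a global equality delete might touch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    path: str
    partition: str  # hypothetical hour partition, e.g. "2020-10-01-00"
    id_lower: int   # lower bound of the id column in this file
    id_upper: int   # upper bound of the id column in this file


def partitions_for_equality_delete(files, value):
    """Return partitions whose files may contain `value`, based on column bounds."""
    return sorted({f.partition for f in files if f.id_lower <= value <= f.id_upper})


files = [
    DataFile("a.parquet", "2020-09-30-22", 1, 4),
    DataFile("b.parquet", "2020-09-30-23", 3, 7),
    DataFile("c.parquet", "2020-10-01-00", 5, 9),
    DataFile("d.parquet", "2020-10-01-01", 8, 12),
]

# Partitions that a global `id = 5` delete would be re-encoded against.
targets = partitions_for_equality_delete(files, 5)
```

Here `targets` is `['2020-09-30-23', '2020-10-01-00']`. Bounds are conservative, so a selected partition may not actually contain the value, but no partition that does contain it is skipped.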
   
   We could store expressions instead of equality deletes everywhere to be able to do the rewrite, but that quickly turns into something horrible to apply at read time. A nice thing about equality deletes is that there is a small set of possible schemas (subsets of table columns) and delete files can be combined by unioning filter sets (or merging if rows are ordered).
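
A minimal sketch of the two ways of combining delete files mentioned above, assuming all inputs share the same equality columns and all apply to the data file being read. Rows are modeled as tuples of the delete column values, and the function names are made up for the example.

```python
import heapq


def union_deletes(*delete_files):
    """Combine unordered delete files into one filter set."""
    combined = set()
    for rows in delete_files:
        combined.update(rows)
    return combined


def merge_sorted_deletes(*delete_files):
    """Merge delete files that are each already sorted, dropping duplicates."""
    merged = []
    for row in heapq.merge(*delete_files):
        if not merged or merged[-1] != row:
            merged.append(row)
    return merged


file_a = [(1,), (3,), (5,)]  # hypothetical delete rows on a single `id` column
file_b = [(2,), (3,), (8,)]

filter_set = union_deletes(file_a, file_b)
merged_rows = merge_sorted_deletes(file_a, file_b)
```

The set form suits hash lookups while scanning data rows; the merged form keeps the ordering, which is useful when the data rows are sorted on the same columns. Neither has a direct equivalent for arbitrary expressions, which is the read-time cost referred to above.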
   
   I think for now I'd opt not to add global delete filters, but if you think it is a good idea I'd be happy to discuss it more.
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


