Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2023/01/02 22:50:13 UTC

[GitHub] [iceberg] rdblue commented on issue #6514: Support Conditional Transaction Commits

rdblue commented on issue #6514:
URL: https://github.com/apache/iceberg/issues/6514#issuecomment-1369254153

   @fqaiser94, in general I think this is a good idea, but I'm not sure there are very many use cases for it besides deduplicating high-level operations. I also think that using this as a way to make table properties transactional is probably a bad idea, but it's been requested in the past so we should probably have an approved way to accomplish it.
   
   Table properties purposely don't have transactional guarantees, to avoid using them to coordinate state. Table properties are supposed to be used to configure the table, not to hold important state. What I recommend to accomplish the use case you're talking about is putting the watermark in snapshot properties instead of table properties. That's what we do for Flink commits and we get exactly-once behavior, although the check for the watermark is done outside of the commit path. Concurrent Flink writes would use different watermark properties because they use watermarks that are job-specific.
   
   It's a good idea to provide a custom validation that can do any check you want. For example, your Kafka example could create watermarks based on some chunk of time that is being processed and the custom validation could check the last few snapshots to see whether another process has already committed. That's a good use case.
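
A minimal sketch of that kind of lookback check, assuming the snapshot summaries have already been collected into a list ordered oldest first (in Iceberg, snapshot properties set through `SnapshotUpdate.set` show up in `Snapshot.summary()`). The class name `WatermarkCheck`, the method `alreadyCommitted`, and the idea of a fixed `lookback` count are illustrative assumptions, not part of Iceberg:

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an Iceberg class.
final class WatermarkCheck {
  private WatermarkCheck() {}

  // Returns true if any of the last `lookback` snapshot summaries
  // (list is ordered oldest first) already carries `watermark` under `key`.
  static boolean alreadyCommitted(
      List<Map<String, String>> summaries, String key, String watermark, int lookback) {
    int start = Math.max(0, summaries.size() - lookback);
    for (int i = summaries.size() - 1; i >= start; i--) {
      if (watermark.equals(summaries.get(i).get(key))) {
        return true;
      }
    }
    return false;
  }
}
```

Because each job would use its own property key, as with the Flink watermarks mentioned above, a check like this only sees commits made on behalf of the same job.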
   
   To do this, I'd probably take a slightly different approach than the one you've implemented. I'd add a `validate(Predicate<TableMetadata> current)` method to either `SnapshotUpdate` or the more general `PendingUpdate`. That way each table operation can have its own custom validation against the current table state. Committing a transaction would then automatically check the custom validations for each operation in it, so there would be no need to alter `Transaction`.
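
The shape of that proposal can be sketched with toy types. Nothing below is Iceberg code: `ToyTableMetadata`, `ToyTable`, and `ToyAppend` are simplified stand-ins invented for illustration, and the `validate` method shown is the proposed addition, not an existing API. The sketch only shows predicates being registered on an operation and evaluated against the current table state when it commits:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for TableMetadata: snapshot summaries, oldest first.
final class ToyTableMetadata {
  final List<Map<String, String>> snapshotSummaries;

  ToyTableMetadata(List<Map<String, String>> snapshotSummaries) {
    this.snapshotSummaries = List.copyOf(snapshotSummaries);
  }
}

// Hypothetical table that just holds the current metadata.
final class ToyTable {
  private ToyTableMetadata current = new ToyTableMetadata(List.of());

  ToyTableMetadata current() {
    return current;
  }

  ToyAppend newAppend() {
    return new ToyAppend(this);
  }

  void replace(ToyTableMetadata updated) {
    current = updated;
  }
}

// Hypothetical stand-in for one pending operation (a PendingUpdate).
final class ToyAppend {
  private final ToyTable table;
  private final Map<String, String> summary = new HashMap<>();
  private final List<Predicate<ToyTableMetadata>> validations = new ArrayList<>();

  ToyAppend(ToyTable table) {
    this.table = table;
  }

  // Sets a snapshot property for the snapshot this operation produces.
  ToyAppend set(String property, String value) {
    summary.put(property, value);
    return this;
  }

  // The proposed hook: a custom check against the current table state.
  ToyAppend validate(Predicate<ToyTableMetadata> check) {
    validations.add(check);
    return this;
  }

  // Runs every registered check against the current metadata, then commits.
  void commit() {
    ToyTableMetadata base = table.current();
    for (Predicate<ToyTableMetadata> check : validations) {
      if (!check.test(base)) {
        throw new IllegalStateException("Custom validation failed");
      }
    }
    List<Map<String, String>> updated = new ArrayList<>(base.snapshotSummaries);
    updated.add(Map.copyOf(summary));
    table.replace(new ToyTableMetadata(updated));
  }
}
```

With these toy types, a call such as `table.newAppend().set("watermark", "w1").validate(meta -> meta.snapshotSummaries.stream().noneMatch(s -> "w1".equals(s.get("watermark")))).commit()` succeeds the first time and throws on a repeat. A real implementation would presumably need to re-run the predicates against refreshed metadata on every commit retry; the toy above has no retries or concurrency.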


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

