You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/08/11 14:42:33 UTC

[GitHub] [iceberg] szlta opened a new issue, #5507: SnapshotManager can't handle rollback calls that would pick the current snapshot as target snapshot.

szlta opened a new issue, #5507:
URL: https://github.com/apache/iceberg/issues/5507

   ### Apache Iceberg version
   
   0.14.0 (latest release)
   
   ### Query engine
   
   _No response_
   
   ### Please describe the bug 🐞
   
   Example test case for `TestSnapshotManager.java`:
   ```
     @Test
     public void testAttemptToRollbackToCurrentSnapshot() {
       table.newAppend().appendFile(FILE_A).commit();
   
       long currentSnapshotTimestampPlus100 = table.currentSnapshot().timestampMillis() + 100;
       table
           .manageSnapshots()
           .rollbackToTime(currentSnapshotTimestampPlus100)
           .commit();
   
       long currentSnapshotId = table.currentSnapshot().snapshotId();
       table
           .manageSnapshots()
           .rollbackTo(currentSnapshotId)
           .commit();
     }
   ```
   The above case fails with 
   ```
   Cannot commit transaction: last operation has not committed
   java.lang.IllegalStateException: Cannot commit transaction: last operation has not committed
   	at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkState(Preconditions.java:502)
   	at org.apache.iceberg.BaseTransaction.commitTransaction(BaseTransaction.java:252)
   	at org.apache.iceberg.SnapshotManager.commit(SnapshotManager.java:158)
   	...
   ```
   I think in such cases the API should still return without errors, even if the request turned out to be a no-op.
   This was working well in 0.13.0, but since 0.14.0 it is failing due to the refactor in this commit: https://github.com/apache/iceberg/commit/e41e8de673628baa6d641f5ad6b03680c18cc0fe
   
   The reason for the failure is that `SetSnapshotOperation#commit()` doesn't commit on `taskOps` at https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/SetSnapshotOperation.java#L126-L137 leaving  https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/BaseTransaction.java#L105 in false state and eventually causing the transaction commit to throw the above exception.
   
   IMHO there isn't a good reason for `SetSnapshotOperation#commit()` to return without committing in such cases as the commitTransaction() invocation would do a similar logic later anyway.
   So I think the solution would be to remove this "return if no changes" logic from `SetSnapshotOperation`. @rdblue - as the original contributor, what's your opinion?
   cc: @lcspinter @pvary 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on issue #5507: SnapshotManager can't handle rollback calls that would pick the current snapshot as target snapshot.

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #5507:
URL: https://github.com/apache/iceberg/issues/5507#issuecomment-1215687165

   It seems reasonable to set `lastOperationCommitted` in the path where we skip an actual commit. Should we try that and see if it fixes it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on issue #5507: SnapshotManager can't handle rollback calls that would pick the current snapshot as target snapshot.

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #5507:
URL: https://github.com/apache/iceberg/issues/5507#issuecomment-1215695529

   I just merged PR #5536 to fix this. That fix makes sense to me because we can always avoid swapping identical metadata in `TableOperations` rather than not calling `commit`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] amogh-jahagirdar commented on issue #5507: SnapshotManager can't handle rollback calls that would pick the current snapshot as target snapshot.

Posted by GitBox <gi...@apache.org>.
amogh-jahagirdar commented on issue #5507:
URL: https://github.com/apache/iceberg/issues/5507#issuecomment-1213444321

   Yeah I think it makes sense to commit the SetSnapshotOperation. I remember a discussion around this came up in https://github.com/apache/iceberg/pull/4071#discussion_r837008625  for updating snapshot references. We commit the operation even if there are no changes.
   
   Another way I was thinking was in BaseTransaction we https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/BaseTransaction.java#L502 we just advance the "lastOperationCommitted" flag in the case there is no intermediate snapshots or metadata updates. That way we avoid having to do a commit in the first place because it seems as though the original concern is the extra call required for commit? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szlta commented on issue #5507: SnapshotManager can't handle rollback calls that would pick the current snapshot as target snapshot.

Posted by GitBox <gi...@apache.org>.
szlta commented on issue #5507:
URL: https://github.com/apache/iceberg/issues/5507#issuecomment-1214966766

   Thanks all for the feedback here. I think [as referenced] since `UpdateSnapshotReferenceOperation` is comitting empty operations in a similar way already, we should make `SetSnapshotOperation` behave the same way. I think the empty commit in such cases will not cause noticeable performance degradation, and other than that I don't see a good reason why to keep that early `return` statement.
   
   I've opened a PR to make it happen at:
   https://github.com/apache/iceberg/pull/5536


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue closed issue #5507: SnapshotManager can't handle rollback calls that would pick the current snapshot as target snapshot.

Posted by GitBox <gi...@apache.org>.
rdblue closed issue #5507: SnapshotManager can't handle rollback calls that would pick the current snapshot as target snapshot.
URL: https://github.com/apache/iceberg/issues/5507


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on issue #5507: SnapshotManager can't handle rollback calls that would pick the current snapshot as target snapshot.

Posted by GitBox <gi...@apache.org>.
kbendick commented on issue #5507:
URL: https://github.com/apache/iceberg/issues/5507#issuecomment-1213412388

   Since the above refactoring has a transaction being used, it seems like we need to discard the changes and/or commit the transaction (possibly as a no-op).
   
   This does seem like something that should be allowed to proceed though, unless there's a reason not to allow it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org