You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/18 17:48:29 UTC

[GitHub] [iceberg] waifairer opened a new pull request, #5299: AWS: Dynamo Catalog: Pass CommitFailedException up the stack without wrapping

waifairer opened a new pull request, #5299:
URL: https://github.com/apache/iceberg/pull/5299

   This resolves a bug that doesn't let the commit retry mechanism kick in. In the dynamo catalog, `checkMetadataLocation` can throw a CommitFailedException.  If this happens, it's caught in
   ```
   } catch (RuntimeException persistFailure) {
   ```
   
   and the commit state is repeatedly checked via `checkCommitStatus`. In workloads with moderate-to-high concurrency, several commits can occur in the background while `checkCommitStatus` is running. If several commits do occur, the history is erased from the DynamoDb`previous_metadata_location` and the CommitFailedException is wrapped into a CommitStateUnknownException. 
   
   This PR catches explicit CommitFailedException (we know we didn't successfully commit) and bubbles it up to the more comprehensive retry logic that checks if the new commit touches partitions that have been recently modified. 
   
   @jackye1995 would appreciate the review on this. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] danielcweeks commented on a diff in pull request #5299: AWS: Dynamo Catalog: Pass CommitFailedException up the stack without wrapping

Posted by GitBox <gi...@apache.org>.
danielcweeks commented on code in PR #5299:
URL: https://github.com/apache/iceberg/pull/5299#discussion_r923884333


##########
aws/src/main/java/org/apache/iceberg/aws/dynamodb/DynamoDbTableOperations.java:
##########
@@ -114,6 +114,9 @@ protected void doCommit(TableMetadata base, TableMetadata metadata) {
       commitStatus = CommitStatus.SUCCESS;
     } catch (ConditionalCheckFailedException e) {
       throw new CommitFailedException(e, "Cannot commit %s: concurrent update detected", tableName());
+    } catch (CommitFailedException e) {

Review Comment:
   I see what this is addressing and I believe this will fix that issue, but I believe the `checkMetadataLocation` call that throws the commit exception directly is unnecessary.  The base location should be checked in the `persistTable` and handled through the ConditionalCheckFailedException.
   
   This `checkMetadataLocation` seems like it is unnecessary, but I guess at a minimum it avoids a dynamo call. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] waifairer commented on a diff in pull request #5299: AWS: Dynamo Catalog: Pass CommitFailedException up the stack without wrapping

Posted by GitBox <gi...@apache.org>.
waifairer commented on code in PR #5299:
URL: https://github.com/apache/iceberg/pull/5299#discussion_r923889681


##########
aws/src/main/java/org/apache/iceberg/aws/dynamodb/DynamoDbTableOperations.java:
##########
@@ -114,6 +114,9 @@ protected void doCommit(TableMetadata base, TableMetadata metadata) {
       commitStatus = CommitStatus.SUCCESS;
     } catch (ConditionalCheckFailedException e) {
       throw new CommitFailedException(e, "Cannot commit %s: concurrent update detected", tableName());
+    } catch (CommitFailedException e) {

Review Comment:
   I agree. checkMetadataLocation appears to be doing a "clean up" for behaviors that should already be handled with 
   ```
             .conditionExpression(DynamoDbCatalog.COL_VERSION + " = :v")
   ```
   
   That said, I'm not sure if `checkMetadataLocation` was added to address a specific issue or if it was done through an abundance of caution. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] danielcweeks commented on pull request #5299: AWS: Dynamo Catalog: Pass CommitFailedException up the stack without wrapping

Posted by GitBox <gi...@apache.org>.
danielcweeks commented on PR #5299:
URL: https://github.com/apache/iceberg/pull/5299#issuecomment-1188425755

   Thanks @waifairer!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] danielcweeks merged pull request #5299: AWS: Dynamo Catalog: Pass CommitFailedException up the stack without wrapping

Posted by GitBox <gi...@apache.org>.
danielcweeks merged PR #5299:
URL: https://github.com/apache/iceberg/pull/5299


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org