You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2023/01/07 07:36:23 UTC

[GitHub] [iceberg] singhpk234 opened a new pull request, #6539: [WIP] Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

singhpk234 opened a new pull request, #6539:
URL: https://github.com/apache/iceberg/pull/6539

   ### About the change
   
   Implements the idea proposed in https://github.com/apache/iceberg/issues/6367
   
   ### Testing done
   
   Existing UT's


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6539: Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on code in PR #6539:
URL: https://github.com/apache/iceberg/pull/6539#discussion_r1068566815


##########
core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java:
##########
@@ -264,5 +253,35 @@ public void close() {
           "File groups offered after service was closed, "
               + "they were not successfully committed.");
     }
+
+    private void commitReadyCommitGroups() {
+      if (canCreateCommitGroup()) {

Review Comment:
   I think this is a good way to do things, I was considering having some kind of atomic countdown but I like this as well



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6539: Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on code in PR #6539:
URL: https://github.com/apache/iceberg/pull/6539#discussion_r1068568029


##########
core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java:
##########
@@ -264,5 +253,35 @@ public void close() {
           "File groups offered after service was closed, "
               + "they were not successfully committed.");
     }
+
+    private void commitReadyCommitGroups() {
+      if (canCreateCommitGroup()) {
+        synchronized (completedRewrites) {
+          if (canCreateCommitGroup()) {
+            String inProgressCommitToken = UUID.randomUUID().toString();
+            Set<RewriteFileGroup> batch = Sets.newHashSetWithExpectedSize(rewritesPerCommit);
+            for (int i = 0; i < rewritesPerCommit && !completedRewrites.isEmpty(); i++) {
+              batch.add(completedRewrites.poll());
+            }

Review Comment:
   I think the synchronization needs to end before you start the commit or else they will happen one at a time correct?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer merged pull request #6539: Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

Posted by GitBox <gi...@apache.org>.
RussellSpitzer merged PR #6539:
URL: https://github.com/apache/iceberg/pull/6539


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] singhpk234 commented on pull request #6539: Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

Posted by GitBox <gi...@apache.org>.
singhpk234 commented on PR #6539:
URL: https://github.com/apache/iceberg/pull/6539#issuecomment-1380903500

   Sure @jackye1995, presently I didn't add new ut's to this pr as the ut i was thinking of adding : 
   1. make sure a batch is not commited twice 
   2. if number of rewrites are not a multiple of batch size, making sure all the rewrites are already committed 
   were already present as part of existing UT's, Happy to add / think for more UT's if the above cases on top of my mind are not sufficient.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6539: Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

Posted by GitBox <gi...@apache.org>.
singhpk234 commented on code in PR #6539:
URL: https://github.com/apache/iceberg/pull/6539#discussion_r1068675361


##########
core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java:
##########
@@ -264,5 +253,35 @@ public void close() {
           "File groups offered after service was closed, "
               + "they were not successfully committed.");
     }
+
+    private void commitReadyCommitGroups() {
+      if (canCreateCommitGroup()) {
+        synchronized (completedRewrites) {
+          if (canCreateCommitGroup()) {
+            String inProgressCommitToken = UUID.randomUUID().toString();
+            Set<RewriteFileGroup> batch = Sets.newHashSetWithExpectedSize(rewritesPerCommit);
+            for (int i = 0; i < rewritesPerCommit && !completedRewrites.isEmpty(); i++) {
+              batch.add(completedRewrites.poll());
+            }

Review Comment:
   +1, makes sense, making the changes 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on pull request #6539: Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on PR #6539:
URL: https://github.com/apache/iceberg/pull/6539#issuecomment-1376762876

   Is there any new test case we can add to ensure correctness of this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6539: Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on code in PR #6539:
URL: https://github.com/apache/iceberg/pull/6539#discussion_r1068570380


##########
core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java:
##########
@@ -264,5 +253,35 @@ public void close() {
           "File groups offered after service was closed, "
               + "they were not successfully committed.");
     }
+
+    private void commitReadyCommitGroups() {
+      if (canCreateCommitGroup()) {
+        synchronized (completedRewrites) {
+          if (canCreateCommitGroup()) {
+            String inProgressCommitToken = UUID.randomUUID().toString();
+            Set<RewriteFileGroup> batch = Sets.newHashSetWithExpectedSize(rewritesPerCommit);
+            for (int i = 0; i < rewritesPerCommit && !completedRewrites.isEmpty(); i++) {
+              batch.add(completedRewrites.poll());
+            }

Review Comment:
   Otherwise if two offers each need to commit, we have to wait for one to finish before the next can start



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6539: Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on code in PR #6539:
URL: https://github.com/apache/iceberg/pull/6539#discussion_r1068569488


##########
core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java:
##########
@@ -264,5 +253,35 @@ public void close() {
           "File groups offered after service was closed, "
               + "they were not successfully committed.");
     }
+
+    private void commitReadyCommitGroups() {
+      if (canCreateCommitGroup()) {
+        synchronized (completedRewrites) {
+          if (canCreateCommitGroup()) {
+            String inProgressCommitToken = UUID.randomUUID().toString();
+            Set<RewriteFileGroup> batch = Sets.newHashSetWithExpectedSize(rewritesPerCommit);
+            for (int i = 0; i < rewritesPerCommit && !completedRewrites.isEmpty(); i++) {
+              batch.add(completedRewrites.poll());
+            }

Review Comment:
   Synchronize (completeRewrites) {
      create batch
   }
   try commit batch



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on pull request #6539: Core: Create commit groups in commit service offer in RewriteDataFilesCommitManager

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on PR #6539:
URL: https://github.com/apache/iceberg/pull/6539#issuecomment-1381079523

   Thanks @singhpk234 for the code, and @jackye1995 for the review!
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org