Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/04/21 23:39:47 UTC

[GitHub] [iceberg] karuppayya opened a new pull request #2503: Add Spark UI description to Iceberg jobs

karuppayya opened a new pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503


   ![Screen Shot 2021-04-21 at 11 07 55 AM](https://user-images.githubusercontent.com/5082742/115635153-e9ce5980-a2bf-11eb-92f2-de4e5fd044f9.png)
   ![Screen Shot 2021-04-21 at 11 11 22 AM](https://user-images.githubusercontent.com/5082742/115635156-ec30b380-a2bf-11eb-8b13-355ae2977f43.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] karuppayya commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
karuppayya commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r619387227



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,25 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msgs = new ArrayList<>();
+    if (expireOlderThanValue != null) {
+      msgs.add("older_than=" + expireOlderThanValue);
+    }
+    if (retainLastValue != null) {
+      msgs.add("retain_last=" + retainLastValue);
+    }
+    if (expiredSnapshotIds != null) {
+      msgs.add("Snapshot_ids =" + StringUtils.join(expiredSnapshotIds, ","));
+    }
+    return String.format("Expiring snapshots(%s) in %s", StringUtils.join(msgs, ","), table.name());

Review comment:
       The Spark UI description appears to be a fixed-width box: it shows as much of the message as fits and exposes the entire message as a tooltip.
   ![Screen Shot 2021-04-22 at 8 04 26 PM](https://user-images.githubusercontent.com/5082742/115908205-b81fd480-a41e-11eb-9c04-26b1100c02bc.png)
   Would mentioning only the first and last snapshot IDs suggest that just two snapshots are deleted?
   Or could we mention the first snapshot and note that there are "n" more snapshots?
   ![Screen Shot 2021-04-22 at 8 17 55 PM](https://user-images.githubusercontent.com/5082742/115907913-58c1c480-a41e-11eb-84c6-8529cb3c46fb.png)
   
   






[GitHub] [iceberg] aokolnychyi commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r627846861



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/actions/BaseSnapshotTableSparkAction.java
##########
@@ -109,7 +109,8 @@ public SnapshotTable tableProperty(String property, String value) {
 
   @Override
   public SnapshotTable.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("SNAPSHOT-TABLE", "SNAPSHOT-TABLE");
+    JobGroupInfo info = newJobGroupInfo("SNAPSHOT-TABLE",
+        String.format("Snapshotting table %s as %s", sourceTableIdent().toString(), destTableIdent.toString()));

Review comment:
       Do we need to call toString explicitly? Can we also put the description into a separate var so that this fits on one line?
   
   ```
   String desc = String.format("Snapshotting table %s as %s", sourceTableIdent(), destTableIdent);
   ```
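
A minimal sketch of why the explicit `toString()` calls are unnecessary: for a non-null argument that does not implement `Formattable`, the `%s` conversion already calls `toString()`. The `Ident` class and the `SnapshotDescDemo` wrapper are hypothetical stand-ins for illustration, not the identifier types used in the PR.

```java
import java.util.Objects;

final class SnapshotDescDemo {
  // Hypothetical stand-in for a table identifier; not the Iceberg or Spark class.
  static final class Ident {
    private final String namespace;
    private final String name;

    Ident(String namespace, String name) {
      this.namespace = Objects.requireNonNull(namespace);
      this.name = Objects.requireNonNull(name);
    }

    @Override
    public String toString() {
      return namespace + "." + name;
    }
  }

  // %s invokes toString() on each non-null argument, so no explicit call is needed.
  static String desc(Ident source, Ident dest) {
    return String.format("Snapshotting table %s as %s", source, dest);
  }

  public static void main(String[] args) {
    System.out.println(desc(new Ident("db", "src"), new Ident("db", "dest")));
  }
}
```

Pulling the formatted string into a local `desc` variable, as suggested, then lets the `newJobGroupInfo` call fit on one line.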

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,32 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", jobDesc());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String jobDesc() {
+    List<String> options = Lists.newArrayList();
+    if (expireOlderThanValue != null) {
+      options.add("older_than=" + expireOlderThanValue);
+    }
+
+    if (retainLastValue != null) {
+      options.add("retain_last=" + retainLastValue);
+    }
+
+    if (!expiredSnapshotIds.isEmpty()) {
+      Long first = expiredSnapshotIds.stream().findFirst().get();
+      if (expiredSnapshotIds.size() > 1) {
+        options.add(String.format("snapshot_ids: %s (%s more...)", first, expiredSnapshotIds.size() - 1));
+      } else {
+        options.add(String.format("snapshot_id: %s", first));
+      }
+    }
+
+    return String.format("Expiring snapshots(%s) in %s", Joiner.on(',').join(options), table.name());

Review comment:
       nit: space before `(`

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,32 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", jobDesc());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String jobDesc() {
+    List<String> options = Lists.newArrayList();

Review comment:
       nit: let's add one more empty line after options






[GitHub] [iceberg] karuppayya commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
karuppayya commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r627778317



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRemoveOrphanFilesSparkAction.java
##########
@@ -140,10 +142,19 @@ public BaseRemoveOrphanFilesSparkAction deleteWith(Consumer<String> newDeleteFun
 
   @Override
   public RemoveOrphanFiles.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", "REMOVE-ORPHAN-FILES");
+    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msg = new ArrayList<>();
+    msg.add("older_than=" + olderThanTimestamp);
+    if (location != null) {
+      msg.add("location=" + location);
+    }
+    return String.format("Removing orphan files(%s) from %s", StringUtils.join(msg, ","), table.name());

Review comment:
       I can see that this option is available in `org.apache.iceberg.spark.procedures.RemoveOrphanFilesProcedure` but not as part of the actions API. Please let me know if I am missing something.






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r619547323



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,25 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msgs = new ArrayList<>();
+    if (expireOlderThanValue != null) {
+      msgs.add("older_than=" + expireOlderThanValue);
+    }
+    if (retainLastValue != null) {
+      msgs.add("retain_last=" + retainLastValue);
+    }
+    if (expiredSnapshotIds != null) {
+      msgs.add("Snapshot_ids =" + StringUtils.join(expiredSnapshotIds, ","));
+    }
+    return String.format("Expiring snapshots(%s) in %s", StringUtils.join(msgs, ","), table.name());

Review comment:
       I would have said "expiring snapshots from X to Y".






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r627895522



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,32 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", jobDesc());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String jobDesc() {
+    List<String> options = Lists.newArrayList();
+    if (expireOlderThanValue != null) {
+      options.add("older_than=" + expireOlderThanValue);
+    }
+
+    if (retainLastValue != null) {
+      options.add("retain_last=" + retainLastValue);
+    }
+
+    if (!expiredSnapshotIds.isEmpty()) {
+      Long first = expiredSnapshotIds.stream().findFirst().get();
+      if (expiredSnapshotIds.size() > 1) {
+        options.add(String.format("snapshot_ids: %s (%s more...)", first, expiredSnapshotIds.size() - 1));

Review comment:
       Looking good :) 






[GitHub] [iceberg] aokolnychyi commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r627673224



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,29 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {

Review comment:
       Iceberg usually tries to avoid `get` prefix. What about calling it `jobDesc`? That's short and descriptive enough.

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,29 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msgs = new ArrayList<>();
+    if (expireOlderThanValue != null) {
+      msgs.add("older_than=" + expireOlderThanValue);
+    }
+    if (retainLastValue != null) {
+      msgs.add("retain_last=" + retainLastValue);
+    }
+    if (!expiredSnapshotIds.isEmpty()) {
+      Long first = expiredSnapshotIds.stream().findFirst().get();
+      if (expiredSnapshotIds.size() > 1) {
+        msgs.add(String.format("snapshot_ids: %s(%s more...)", first, expiredSnapshotIds.size() - 1));

Review comment:
       nit: let's add a space before `(`

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,29 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msgs = new ArrayList<>();

Review comment:
       What about calling it options and using Lists?
   
   ```
   List<String> options = Lists.newArrayList();
   ```

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRemoveOrphanFilesSparkAction.java
##########
@@ -140,10 +142,19 @@ public BaseRemoveOrphanFilesSparkAction deleteWith(Consumer<String> newDeleteFun
 
   @Override
   public RemoveOrphanFiles.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", "REMOVE-ORPHAN-FILES");
+    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msg = new ArrayList<>();
+    msg.add("older_than=" + olderThanTimestamp);
+    if (location != null) {
+      msg.add("location=" + location);
+    }
+    return String.format("Removing orphan files(%s) from %s", StringUtils.join(msg, ","), table.name());

Review comment:
       We also need to add dry_run to the description.

##########
File path: spark3/src/main/java/org/apache/iceberg/spark/actions/BaseSnapshotTableSparkAction.java
##########
@@ -109,7 +109,8 @@ public SnapshotTable tableProperty(String property, String value) {
 
   @Override
   public SnapshotTable.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("SNAPSHOT-TABLE", "SNAPSHOT-TABLE");
+    JobGroupInfo info = newJobGroupInfo("SNAPSHOT-TABLE",
+        String.format("Snapshotting table %s(location=%s)", sourceTableIdent().toString(), destTableLocation));

Review comment:
       I think it is important to show the dest table identifier. Also, the location can be null if not provided. Since we are not showing table props and the location can be long, I'll be fine just saying this:
   
   ```
   Snapshotting table %s as %s
   ```
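
To illustrate the null concern, here is a small sketch comparing the two formats. It assumes plain `String` arguments in place of the real identifier and location fields; `NullLocationDemo`, `withLocation`, and `withDest` are hypothetical names.

```java
final class NullLocationDemo {
  // Earlier format from the diff: prints the literal text "null" when no location was provided.
  static String withLocation(String sourceIdent, String destLocation) {
    return String.format("Snapshotting table %s(location=%s)", sourceIdent, destLocation);
  }

  // Format proposed in the review: source and destination identifiers only.
  static String withDest(String sourceIdent, String destIdent) {
    return String.format("Snapshotting table %s as %s", sourceIdent, destIdent);
  }

  public static void main(String[] args) {
    System.out.println(withLocation("db.src", null));
    System.out.println(withDest("db.src", "db.dest"));
  }
}
```

With a missing location the first form yields `Snapshotting table db.src(location=null)`, which is the kind of noise the shorter message avoids.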

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,29 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msgs = new ArrayList<>();
+    if (expireOlderThanValue != null) {

Review comment:
       Can we add empty lines between these blocks?
   
   ```
       List<String> options = ...
   
       if (expireOlderThanValue != null) {
         ...
       }
   
       if (retainLastValue != null) {
         ....
       }
   
       if (!expiredSnapshotIds.isEmpty()) {
         ...
       }
   
       return ...
   ```

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRemoveOrphanFilesSparkAction.java
##########
@@ -140,10 +142,19 @@ public BaseRemoveOrphanFilesSparkAction deleteWith(Consumer<String> newDeleteFun
 
   @Override
   public RemoveOrphanFiles.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", "REMOVE-ORPHAN-FILES");
+    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {

Review comment:
       Same here, jobDesc, List<String> options, extra spaces, JOINER

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -19,11 +19,14 @@
 
 package org.apache.iceberg.spark.actions;
 
+import java.util.ArrayList;
 import java.util.Iterator;
+import java.util.List;
 import java.util.Set;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.atomic.AtomicLong;
 import java.util.function.Consumer;
+import org.apache.commons.lang3.StringUtils;

Review comment:
       I think Iceberg mostly uses `Joiner` from Guava for this purpose.
   
   ```
   private static final Joiner COMMA = Joiner.on(",");
   
   COMMA.join(options)
   ```
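
As a rough, dependency-free illustration of the pattern, the sketch below builds the options list and joins it with a shared separator. It deliberately uses `String.join` from the JDK as a stand-in so it runs without Guava; in the PR itself the reviewer's `Joiner` constant would take its place. `OptionsJoinDemo`, its parameters, and the table name are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class OptionsJoinDemo {
  // Separator shared by all callers; stands in for a Guava Joiner constant.
  private static final String COMMA = ",";

  static String jobDesc(Long olderThan, Integer retainLast, String tableName) {
    List<String> options = new ArrayList<>();

    if (olderThan != null) {
      options.add("older_than=" + olderThan);
    }

    if (retainLast != null) {
      options.add("retain_last=" + retainLast);
    }

    // Only the options that were actually set appear inside the parentheses.
    return String.format("Expiring snapshots (%s) in %s", String.join(COMMA, options), tableName);
  }

  public static void main(String[] args) {
    System.out.println(jobDesc(100L, 5, "db.tbl"));
  }
}
```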

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRewriteManifestsSparkAction.java
##########
@@ -142,7 +142,8 @@ public RewriteManifests stagingLocation(String newStagingLocation) {
 
   @Override
   public RewriteManifests.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REWRITE-MANIFESTS", "REWRITE-MANIFESTS");
+    JobGroupInfo info = newJobGroupInfo("REWRITE-MANIFESTS",
+        String.format("Rewriting manifests(staging location=%s) of %s", stagingLocation, table.name()));

Review comment:
       Let's add space before `(`

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRewriteManifestsSparkAction.java
##########
@@ -142,7 +142,8 @@ public RewriteManifests stagingLocation(String newStagingLocation) {
 
   @Override
   public RewriteManifests.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REWRITE-MANIFESTS", "REWRITE-MANIFESTS");
+    JobGroupInfo info = newJobGroupInfo("REWRITE-MANIFESTS",
+        String.format("Rewriting manifests(staging location=%s) of %s", stagingLocation, table.name()));

Review comment:
       Can we move this into a separate var and call it `desc` like below?

##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,29 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msgs = new ArrayList<>();
+    if (expireOlderThanValue != null) {
+      msgs.add("older_than=" + expireOlderThanValue);
+    }
+    if (retainLastValue != null) {
+      msgs.add("retain_last=" + retainLastValue);
+    }
+    if (!expiredSnapshotIds.isEmpty()) {
+      Long first = expiredSnapshotIds.stream().findFirst().get();
+      if (expiredSnapshotIds.size() > 1) {
+        msgs.add(String.format("snapshot_ids: %s(%s more...)", first, expiredSnapshotIds.size() - 1));
+      } else {
+        msgs.add(String.format("snapshot_id: %s", first));
+      }
+    }
+    return String.format("Expiring snapshots(%s) in %s", StringUtils.join(msgs, ","), table.name());

Review comment:
       Can we add a space before `(`?
   
   ```
   Expiring snapshots (%s) in %s
   ```






[GitHub] [iceberg] karuppayya commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
karuppayya commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r619387323



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRemoveOrphanFilesSparkAction.java
##########
@@ -140,10 +142,20 @@ public BaseRemoveOrphanFilesSparkAction deleteWith(Consumer<String> newDeleteFun
 
   @Override
   public RemoveOrphanFiles.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", "REMOVE-ORPHAN-FILES");
+    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES",
+        String.format("Removing orphan files(%s) from %s", getDescription(), table.name()));

Review comment:
       Fixed






[GitHub] [iceberg] karuppayya closed pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
karuppayya closed pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503


   




[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r618756021



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRemoveOrphanFilesSparkAction.java
##########
@@ -140,10 +142,20 @@ public BaseRemoveOrphanFilesSparkAction deleteWith(Consumer<String> newDeleteFun
 
   @Override
   public RemoveOrphanFiles.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", "REMOVE-ORPHAN-FILES");
+    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES",
+        String.format("Removing orphan files(%s) from %s", getDescription(), table.name()));

Review comment:
       Shouldn't this just be "getDescription()"?






[GitHub] [iceberg] karuppayya commented on pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
karuppayya commented on pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#issuecomment-833892984


   The test failures do not seem to be related to the change.




[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r618755393



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,25 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msgs = new ArrayList<>();
+    if (expireOlderThanValue != null) {
+      msgs.add("older_than=" + expireOlderThanValue);
+    }
+    if (retainLastValue != null) {
+      msgs.add("retain_last=" + retainLastValue);
+    }
+    if (expiredSnapshotIds != null) {
+      msgs.add("Snapshot_ids =" + StringUtils.join(expiredSnapshotIds, ","));
+    }
+    return String.format("Expiring snapshots(%s) in %s", StringUtils.join(msgs, ","), table.name());

Review comment:
       Have you tested this with a very large number of snapshots? I think this is a great idea; I just worry about what happens if I expire, say, 500 snapshots. Is the UI still usable?
   
   I think maybe it would be sufficient to list the first and last snapshot being expired.
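
One way to keep the description bounded regardless of how many snapshots are expired is to print only one ID and a count of the rest. The sketch below is a standalone illustration under that assumption; `SnapshotIdSummary` and `summarize` are hypothetical names, and "first" simply means the first element in the set's iteration order.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class SnapshotIdSummary {
  // Returns a short option string whose length does not grow with the number of IDs.
  static String summarize(Set<Long> ids) {
    if (ids.isEmpty()) {
      return "";
    }

    // First element in iteration order; for an unordered set this is an arbitrary ID.
    Long first = ids.iterator().next();
    if (ids.size() > 1) {
      return String.format("snapshot_ids: %s (%s more...)", first, ids.size() - 1);
    }

    return String.format("snapshot_id: %s", first);
  }

  public static void main(String[] args) {
    Set<Long> ids = new LinkedHashSet<>();
    for (long id = 1; id <= 500; id++) {
      ids.add(id);
    }
    System.out.println(summarize(ids));
  }
}
```

Expiring 500 snapshots then produces a description of the same length as expiring two, so the fixed-width box in the Spark UI is not affected.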






[GitHub] [iceberg] karuppayya commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
karuppayya commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r627777792



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRemoveOrphanFilesSparkAction.java
##########
@@ -140,10 +142,19 @@ public BaseRemoveOrphanFilesSparkAction deleteWith(Consumer<String> newDeleteFun
 
   @Override
   public RemoveOrphanFiles.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", "REMOVE-ORPHAN-FILES");
+    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {

Review comment:
       done






[GitHub] [iceberg] karuppayya commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
karuppayya commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r627778317



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRemoveOrphanFilesSparkAction.java
##########
@@ -140,10 +142,19 @@ public BaseRemoveOrphanFilesSparkAction deleteWith(Consumer<String> newDeleteFun
 
   @Override
   public RemoveOrphanFiles.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", "REMOVE-ORPHAN-FILES");
+    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msg = new ArrayList<>();
+    msg.add("older_than=" + olderThanTimestamp);
+    if (location != null) {
+      msg.add("location=" + location);
+    }
+    return String.format("Removing orphan files(%s) from %s", StringUtils.join(msg, ","), table.name());

Review comment:
       I can see that this option is available in `org.apache.iceberg.spark.procedures.RemoveOrphanFilesProcedure` but not as part of the actions API. Please let me know if I am missing something, @aokolnychyi.






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r619547559



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,25 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msgs = new ArrayList<>();
+    if (expireOlderThanValue != null) {
+      msgs.add("older_than=" + expireOlderThanValue);
+    }
+    if (retainLastValue != null) {
+      msgs.add("retain_last=" + retainLastValue);
+    }
+    if (expiredSnapshotIds != null) {
+      msgs.add("Snapshot_ids =" + StringUtils.join(expiredSnapshotIds, ","));
+    }
+    return String.format("Expiring snapshots(%s) in %s", StringUtils.join(msgs, ","), table.name());

Review comment:
       I think mentioning the first snapshot eliminated and how many others are being removed is good too.
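
A minimal sketch of what that suggestion could look like, assuming the explicit snapshot ids are held in an ordered list so that "first" is well defined. The class name `ExpireDescriptionSketch` and the helper `summarizeIds` are hypothetical and are not part of the PR; only the `older_than` and `retain_last` keys and the overall message shape come from the diff above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ExpireDescriptionSketch {

  // Hypothetical helper: renders ids as "<first> (+N more)" to keep the UI description short.
  static String summarizeIds(List<Long> ids) {
    if (ids.isEmpty()) {
      return "none";
    }
    String first = String.valueOf(ids.get(0));
    int others = ids.size() - 1;
    return others == 0 ? first : String.format("%s (+%d more)", first, others);
  }

  // Mirrors the shape of getDescription() in the diff, with plain parameters instead of fields.
  static String description(Long olderThan, Integer retainLast, List<Long> snapshotIds, String tableName) {
    List<String> options = new ArrayList<>();
    if (olderThan != null) {
      options.add("older_than=" + olderThan);
    }
    if (retainLast != null) {
      options.add("retain_last=" + retainLast);
    }
    if (snapshotIds != null && !snapshotIds.isEmpty()) {
      options.add("snapshot_ids=" + summarizeIds(snapshotIds));
    }
    return String.format("Expiring snapshots (%s) in %s", String.join(",", options), tableName);
  }

  public static void main(String[] args) {
    System.out.println(description(1619000000000L, 5, Arrays.asList(11L, 12L, 13L), "db.tbl"));
  }
}
```

Summarizing instead of joining every id keeps the job description readable when many snapshots are expired at once, at the cost of hiding the full list.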






[GitHub] [iceberg] aokolnychyi commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r627836067



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseRemoveOrphanFilesSparkAction.java
##########
@@ -140,10 +142,19 @@ public BaseRemoveOrphanFilesSparkAction deleteWith(Consumer<String> newDeleteFun
 
   @Override
   public RemoveOrphanFiles.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", "REMOVE-ORPHAN-FILES");
+    JobGroupInfo info = newJobGroupInfo("REMOVE-ORPHAN-FILES", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msg = new ArrayList<>();
+    msg.add("older_than=" + olderThanTimestamp);
+    if (location != null) {
+      msg.add("location=" + location);
+    }
+    return String.format("Removing orphan files(%s) from %s", StringUtils.join(msg, ","), table.name());

Review comment:
       Oh, yeah, we handle it by passing an empty delete function. Never mind.
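
For readers unfamiliar with the pattern, here is a rough sketch of "passing an empty delete function". It assumes only what the hunk header shows, namely that the action exposes `deleteWith(Consumer<String>)`. `ToyRemoveOrphanFiles` is a hypothetical stand-in, not the Iceberg class, and the thread does not say which option is being discussed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class EmptyDeleteFuncSketch {

  // Hypothetical stand-in for an action with a pluggable delete function.
  static class ToyRemoveOrphanFiles {
    private final List<String> orphanFiles;
    private Consumer<String> deleteFunc;

    ToyRemoveOrphanFiles(List<String> orphanFiles, Consumer<String> defaultDeleteFunc) {
      this.orphanFiles = orphanFiles;
      this.deleteFunc = defaultDeleteFunc;
    }

    ToyRemoveOrphanFiles deleteWith(Consumer<String> newDeleteFunc) {
      this.deleteFunc = newDeleteFunc;
      return this;
    }

    // Calls the delete function on every orphan file and returns the files it visited.
    List<String> execute() {
      orphanFiles.forEach(deleteFunc);
      return new ArrayList<>(orphanFiles);
    }
  }

  public static void main(String[] args) {
    List<String> deleted = new ArrayList<>();
    List<String> found = new ToyRemoveOrphanFiles(Arrays.asList("s3://b/a.parquet", "s3://b/b.parquet"), deleted::add)
        .deleteWith(path -> { })  // empty delete function: files are reported but not removed
        .execute();
    System.out.println("found=" + found.size() + " deleted=" + deleted.size());
  }
}
```

With the no-op consumer the caller still gets the list of candidate files back, which is why a separate flag on the action is not strictly needed.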






[GitHub] [iceberg] karuppayya commented on a change in pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
karuppayya commented on a change in pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#discussion_r627778778



##########
File path: spark/src/main/java/org/apache/iceberg/spark/actions/BaseExpireSnapshotsSparkAction.java
##########
@@ -181,10 +184,29 @@ public BaseExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc)
 
   @Override
   public ExpireSnapshots.Result execute() {
-    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", "EXPIRE-SNAPSHOTS");
+    JobGroupInfo info = newJobGroupInfo("EXPIRE-SNAPSHOTS", getDescription());
     return withJobGroupInfo(info, this::doExecute);
   }
 
+  private String getDescription() {
+    List<String> msgs = new ArrayList<>();

Review comment:
       Changed it to `options`.






[GitHub] [iceberg] aokolnychyi merged pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
aokolnychyi merged pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503


   




[GitHub] [iceberg] aokolnychyi commented on pull request #2503: Add Spark UI description to Iceberg jobs

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2503:
URL: https://github.com/apache/iceberg/pull/2503#issuecomment-833970309


   @RussellSpitzer, could you take one more look too?

