You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/12/12 15:33:05 UTC

[GitHub] [iceberg] nastra opened a new pull request, #6404: Core: Allow configuring metrics reporter impl via Catalog property

nastra opened a new pull request, #6404:
URL: https://github.com/apache/iceberg/pull/6404

   In certain cases it makes sense to make the used metrics reporter customizable, so that users have more control over it and how it's reporting


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1046068902


##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -304,7 +306,9 @@ public Table loadTable(SessionContext context, TableIdentifier identifier) {
         new BaseTable(
             ops,
             fullTableName(loadedIdent),
-            report -> reportMetrics(tableIdentifier, report, session::headers));

Review Comment:
   I agree with @nastra on these.
   
   I don't think that the metrics reporter needs to be very flexible because it isn't a user setting or a table setting. It is almost certainly something that will be handled by platform administrators. That means injecting it per catalog is the right thing to do.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1046134055


##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -106,7 +106,7 @@ public class RESTSessionCatalog extends BaseSessionCatalog
   private ResourcePaths paths = null;
   private Object conf = null;
   private FileIO io = null;
-  private MetricsReporter reporter = null;
+  private MetricsReporter customMetricsReporter = null;

Review Comment:
   I was confused for a sec when reading the code I wrote a while ago and thought why this wasn't called `customMetricsReporter` in the first place, hence the change. I can change it back if you think it's not worth renaming this



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1045678055


##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -304,7 +306,9 @@ public Table loadTable(SessionContext context, TableIdentifier identifier) {
         new BaseTable(
             ops,
             fullTableName(loadedIdent),
-            report -> reportMetrics(tableIdentifier, report, session::headers));

Review Comment:
   Just distantly related and only for my own information: With the current implementation MetricsReporter could be configured when we start the Catalog and then all the tables within this catalog would use the same MetricsReporter. I'm wondering if it would make sense to make this setting more flexible, like being able to set this even after the Catalog is up and running, and additionally have the ability to use different MetricsReporter implementation for different tables.
   I don't have any specific use-cases in mind, one could be when let's say querying a table becomes pretty slow at some point and we want to switch to another MetricsReporter that could help us analysing the scan metrics to spot the issue, and then switch back to LoggingMetricsReporter once the issue is solved. (Assuming that in the future we gonna have way more scan metrics that could be used for debugging such issues)
   
   This is not meant to be covered in this PR just thinking out loud if it makes sense.



##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -177,7 +177,9 @@ public void initialize(String name, Map<String, String> unresolved) {
     this.io =
         CatalogUtil.loadFileIO(
             ioImpl != null ? ioImpl : ResolvingFileIO.class.getName(), mergedProps, conf);
-    this.reporter = new LoggingMetricsReporter();
+    String metricsReporterImpl = mergedProps.get(CatalogProperties.METRICS_REPORTER_IMPL);

Review Comment:
   Would it make sense to add a test where you configure a custom MetricsReporter through this new catalog property?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #6404:
URL: https://github.com/apache/iceberg/pull/6404


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra closed pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
nastra closed pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property
URL: https://github.com/apache/iceberg/pull/6404


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1045939277


##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -177,7 +177,9 @@ public void initialize(String name, Map<String, String> unresolved) {
     this.io =
         CatalogUtil.loadFileIO(
             ioImpl != null ? ioImpl : ResolvingFileIO.class.getName(), mergedProps, conf);
-    this.reporter = new LoggingMetricsReporter();
+    String metricsReporterImpl = mergedProps.get(CatalogProperties.METRICS_REPORTER_IMPL);

Review Comment:
   makes sense, I added a test



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1046773176


##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -304,7 +306,9 @@ public Table loadTable(SessionContext context, TableIdentifier identifier) {
         new BaseTable(
             ops,
             fullTableName(loadedIdent),
-            report -> reportMetrics(tableIdentifier, report, session::headers));

Review Comment:
   Thanks for sharing your thoughts, @nastra, @rdblue!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1045949398


##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -304,7 +306,9 @@ public Table loadTable(SessionContext context, TableIdentifier identifier) {
         new BaseTable(
             ops,
             fullTableName(loadedIdent),
-            report -> reportMetrics(tableIdentifier, report, session::headers));

Review Comment:
   > like being able to set this even after the Catalog is up and running
   
   I'm not sure this is want we want. I would probably say it's better/easier to just re-configure/re-build the catalog with a different metrics reporter. This is what we do across other catalog properties as well. 
   
   > have the ability to use different MetricsReporter implementation for different tables.
   
   I don't think the UX will be great in such a case where a user would have to re-configure the metrics reporter per table. I think if we'd want to support something like that, then we'd need a strong use case that justifies this.
   
   This is just my personal opinion, but I'm happy to hear what others think on that topic. If there's enough interest/a strong use case then we can re-evaluate.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1046062940


##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -316,7 +320,7 @@ private void reportMetrics(
       TableIdentifier tableIdentifier,
       MetricsReport report,
       Supplier<Map<String, String>> headers) {
-    reporter.report(report);
+    LoggingMetricsReporter.instance().report(report);

Review Comment:
   I think that the logging metrics reporter should be replaced by the injected metrics reporter. Either way, this method should be called. And I think we will need to move the reporter call inside the try/catch.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1046056595


##########
api/src/main/java/org/apache/iceberg/metrics/LoggingMetricsReporter.java:
##########
@@ -28,6 +28,11 @@
  */
 public class LoggingMetricsReporter implements MetricsReporter {
   private static final Logger LOG = LoggerFactory.getLogger(LoggingMetricsReporter.class);
+  private static final LoggingMetricsReporter INSTANCE = new LoggingMetricsReporter();

Review Comment:
   This is in two PRs now, right? Should we separate it out into its own so we don't have conflicting PRs?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1046057652


##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -106,7 +106,7 @@ public class RESTSessionCatalog extends BaseSessionCatalog
   private ResourcePaths paths = null;
   private Object conf = null;
   private FileIO io = null;
-  private MetricsReporter reporter = null;
+  private MetricsReporter customMetricsReporter = null;

Review Comment:
   Why change the name? Does it matter how the metrics reporter was created?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#issuecomment-1356901600

   Thanks, @nastra!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1046069256


##########
api/src/main/java/org/apache/iceberg/metrics/LoggingMetricsReporter.java:
##########
@@ -28,6 +28,11 @@
  */
 public class LoggingMetricsReporter implements MetricsReporter {
   private static final Logger LOG = LoggerFactory.getLogger(LoggingMetricsReporter.class);
+  private static final LoggingMetricsReporter INSTANCE = new LoggingMetricsReporter();

Review Comment:
   This is small enough I don't think it's a problem.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #6404: Core: Allow configuring metrics reporter impl via Catalog property

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #6404:
URL: https://github.com/apache/iceberg/pull/6404#discussion_r1046130246


##########
api/src/main/java/org/apache/iceberg/metrics/LoggingMetricsReporter.java:
##########
@@ -28,6 +28,11 @@
  */
 public class LoggingMetricsReporter implements MetricsReporter {
   private static final Logger LOG = LoggerFactory.getLogger(LoggingMetricsReporter.class);
+  private static final LoggingMetricsReporter INSTANCE = new LoggingMetricsReporter();

Review Comment:
   it's in 2 PRs because I thought that the other PR would have been merged by now. Also I thought it's a small enough diff that wouldn't cause any problems for either of the 2 PRs



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org