Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/07/11 15:01:42 UTC

[GitHub] [hudi] leesf commented on a change in pull request #1769: [DOC] Add document for the use of metrics system in Hudi.

leesf commented on a change in pull request #1769:
URL: https://github.com/apache/hudi/pull/1769#discussion_r453203145



##########
File path: docs/_docs/2_8_metrics.md
##########
@@ -0,0 +1,108 @@
+---
+title: Metrics Guide
+keywords: hudi, administration, operation, devops, metrics
+permalink: /docs/metrics.html
+summary: This section offers an overview of metrics in Hudi
+toc: true
+last_modified_at: 2020-06-20T15:59:57-04:00
+---
+
+In this section, we introduce the metrics and the MetricsReporter implementations in Hudi. You can view the metrics configuration [here](configurations.html#metrics-configs).
+
+## Metrics
+
+Once the Hudi writer is configured with the right table and environment for metrics, it produces the following Graphite metrics, which aid in debugging Hudi tables:
+
+ - **Commit Duration** - The amount of time it took to successfully commit a batch of records
+ - **Rollback Duration** - Similarly, the amount of time taken to undo partial data left over by a failed commit (happens automatically after every failing write)
+ - **File Level Metrics** - Shows the number of new files added, file versions, and files deleted (cleaned) in each commit
+ - **Record Level Metrics** - Total records inserted, updated, etc. per commit
+ - **Partition Level Metrics** - Number of partitions upserted (very useful for understanding sudden spikes in commit duration)
+
+These metrics can then be plotted on a standard tool like Grafana. Below is a sample commit duration chart.
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_commit_duration.png" alt="hudi_commit_duration.png" style="max-width: 100%" />
+</figure>
+
+## MetricsReporter
+
+MetricsReporter is an interface for reporting metrics to a user-specified destination. Currently, its implementations are InMemoryMetricsReporter, JmxMetricsReporter, MetricsGraphiteReporter and DatadogMetricsReporter. Since InMemoryMetricsReporter is only used for testing, we introduce the other three implementations.
+
+### JmxMetricsReporter
+
+JmxMetricsReporter is an implementation of a JMX reporter, which is used to report JMX metrics.
+
+#### Configurations
+The following is an example JMX configuration. For the detailed configuration, refer to [here](configurations.html#jmx).
+
+  ```properties
+  hoodie.metrics.on=true
+  hoodie.metrics.reporter.type=JMX
+  hoodie.metrics.jmx.host=192.168.0.106
+  hoodie.metrics.jmx.port=4001
+  ```
+
+#### Demo
+With the configuration above, Hudi metrics will start a JMX server on port 4001. We can then start jconsole and connect to 192.168.0.106:4001. Below is a sample of monitoring Hudi JMX metrics through jconsole.
+<figure>
+    <img class="docimage" src="/assets/images/hudi_jxm_metrics.png" alt="hudi_jxm_metrics.png" style="max-width: 100%" />
+</figure>
+
+### MetricsGraphiteReporter
+
+MetricsGraphiteReporter is an implementation of a Graphite reporter, which connects to a Graphite server and sends metrics to it.
+
+#### Configurations
+The following is an example Graphite configuration. For the detailed configuration, refer to [here](configurations.html#graphite).
+
+  ```properties
+  hoodie.metrics.on=true
+  hoodie.metrics.reporter.type=GRAPHITE
+  hoodie.metrics.graphite.host=192.168.0.106
+  hoodie.metrics.graphite.port=2003
+  hoodie.metrics.graphite.metric.prefix=<your metrics prefix>
+  ```
+#### Demo
+With the configuration above, we should first start a Graphite server on host 192.168.0.106 and port 2003. Hudi metrics will then connect to the Graphite server and report Hudi metrics to it. Below is a sample of monitoring Hudi metrics through Graphite.
+  <figure>
+      <img class="docimage" src="/assets/images/hudi_graphite_metrics.png" alt="hudi_graphite_metrics.png" style="max-width: 100%" />
+  </figure>
+
+### DatadogMetricsReporter
+
+DatadogMetricsReporter is an implementation of a Datadog reporter.
+It publishes metric values to the Datadog monitoring service via the Datadog HTTP API.
+
+#### Configurations
+The following is an example Datadog configuration. For the detailed configuration, refer to [here](configurations.html#datadog).
+
+```properties
+hoodie.metrics.on=true
+hoodie.metrics.reporter.type=DATADOG
+hoodie.metrics.datadog.api.site=EU # or US
+hoodie.metrics.datadog.api.key=<your api key>
+hoodie.metrics.datadog.metric.prefix=<your metrics prefix>
+```
+
+ * `hoodie.metrics.datadog.api.site` will sets the Datadog API site, It determines whether the requests will be sent to api.datadoghq.eu (EU) or api.datadoghq.com (US). Set this according to your Datadog account settings.

Review comment:
       `will sets the Datadog API site, `-> `will set the Datadog API site.`
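
For context, the behaviour this line documents (the site value selecting the API host) can be illustrated with a minimal sketch. The helper name `datadog_api_host` and the lookup table are hypothetical and are not taken from Hudi's code; only the two host names and the `EU`/`US` values come from the doc text under review.

```python
# Hypothetical illustration only; this is not Hudi's implementation.
# The two hosts are the ones named in the doc line under review.
DATADOG_API_HOSTS = {
    "EU": "api.datadoghq.eu",
    "US": "api.datadoghq.com",
}


def datadog_api_host(site: str) -> str:
    """Return the API host for a configured site value such as 'EU' or 'US'."""
    key = site.strip().upper()  # assumption: tolerate case and stray whitespace
    if key not in DATADOG_API_HOSTS:
        raise ValueError(f"Unknown Datadog API site: {site!r} (expected EU or US)")
    return DATADOG_API_HOSTS[key]
```

Under these assumptions, `datadog_api_host("EU")` yields `api.datadoghq.eu`, and an unrecognised value raises an error rather than silently falling back to a default.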



