Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/02/26 06:26:00 UTC

[GitHub] [hudi] nbalajee opened a new pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

nbalajee opened a new pull request #2607:
URL: https://github.com/apache/hudi/pull/2607


   …tors
   
   ## What is the purpose of the pull request
   Framework for collecting Hudi observability stats from the executors.
   
   ## Brief change log
   
   - Use the distributed registry to report stats from the executors to the driver, which publishes them through the Graphite reporter.
   - Report Hudi write stage performance stats.
   - Report Hudi BloomIndex stage stats.
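
For illustration only, the write stage stats could be derived from three raw totals roughly as sketched below. The metric names mirror the constants in `HoodieObservabilityStat.java` quoted later in this thread, and the inputs mirror the parameters of `recordWriteStats`. The class `WriteStatsSketch` and the exact formulas are assumptions, not code from the PR.

```java
/** Hypothetical helper, not part of the PR: derives write metrics from raw totals. */
final class WriteStatsSketch {
  static final long ONE_MB = 1024L * 1024L;
  static final long USEC_PER_MSEC = 1000L;

  /** Average write time per record in microseconds; 0 when no records were written. */
  static long writeTimePerRecordInUSec(long totalRecs, long cumulativeWriteTimeInMsec) {
    if (totalRecs <= 0) {
      return 0L;
    }
    return cumulativeWriteTimeInMsec * USEC_PER_MSEC / totalRecs;
  }

  /** Write time per megabyte in microseconds; 0 when the file is empty. */
  static long writeTimePerMBInUSec(long fileSizeInBytes, long cumulativeWriteTimeInMsec) {
    if (fileSizeInBytes <= 0) {
      return 0L;
    }
    // Multiply before dividing so files smaller than 1 MB do not divide by zero.
    return cumulativeWriteTimeInMsec * USEC_PER_MSEC * ONE_MB / fileSizeInBytes;
  }

  /** Write throughput in MB per second; 0 when no time was recorded. */
  static double writeThroughputMBps(long fileSizeInBytes, long cumulativeWriteTimeInMsec) {
    if (cumulativeWriteTimeInMsec <= 0) {
      return 0.0;
    }
    double megabytes = (double) fileSizeInBytes / ONE_MB;
    double seconds = cumulativeWriteTimeInMsec / 1000.0;
    return megabytes / seconds;
  }
}
```

With 1,000,000 records and a 100 MB file written in 2,000 ms, this sketch yields 2 usec per record, 20,000 usec per MB, and 50 MB/s.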
   
   ## Verify this pull request
   
   This change added tests and can be verified as follows:
   
     -  Added a unit test case, testObservabilityMetricsOnCOW.
     -  Manually verified the change by running a job locally.
   
   ## Committer checklist
   
    - [x] Has a corresponding JIRA in PR title & commit

    - [x] Commit message is descriptive of the change

    - [x] CI is green

    - [x] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nbalajee commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
nbalajee commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r612022146



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java
##########
@@ -196,6 +200,10 @@ public IOType getIOType() {
       stat.setRuntimeStats(runtimeStats);

Review comment:
       Reused the timer.







[GitHub] [hudi] nbalajee commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
nbalajee commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r597098674



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieObservabilityStat.java
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.model;
+
+import org.apache.hudi.common.metrics.Registry;
+import org.apache.hudi.common.util.Option;
+
+import java.io.Serializable;
+
+/**
+ * Observability related metrics collection operations.
+ */
+public class HoodieObservabilityStat implements Serializable {
+  public static final String OBSERVABILITY_REGISTRY_NAME = "Observability";
+
+  // define a unique metric name string for each metric to be collected.
+  public static final String PARQUET_NORMALIZED_WRITE_TIME = "writeTimePerRecordInUSec";
+  public static final String PARQUET_CUMULATIVE_WRITE_TIME = "cumulativeParquetWriteTimeInUSec";
+  public static final String PARQUET_WRITE_TIME_PER_MB_IN_USEC = "writeTimePerMBInUSec";
+  public static final String PARQUET_WRITE_THROUGHPUT_MBPS = "writeThroughputMBps";
+  public static final String TOTAL_RECORDS_WRITTEN = "totalRecordsWritten";
+
+  public enum WriteType {
+    INSERT,
+    UPSERT,
+    UPDATE,
+  }
+
+  public static long ONE_MB = 1024 * 1024;
+  public static long USEC_PER_SEC = 1000 * 1000;
+  Option<Registry> observabilityRegistry;
+  String tableName;
+  WriteType writeType;
+  String hostName;
+  Long stageId;
+  Long partitionId;
+
+  public HoodieObservabilityStat(Option<Registry> registry, String tableName, WriteType type, String host,
+                                 long stageId, long partitionId) {
+    this.observabilityRegistry = registry;
+    this.tableName = tableName;
+    this.writeType = type;
+    this.hostName = host;
+    this.stageId = stageId;
+    this.partitionId = partitionId;
+  }
+
+  private String getWriteMetricWithTypeHostNameAndPartitionId(String tableName, String metric, String type,
+                                                              String host, long partitionId) {
+    return String.format("%s.%s.%s.%s.%d", tableName, metric, type, host, partitionId);
+  }
+
+  private String getConsolidatedWriteMetricKey(String tableName, String metric, String type) {
+    return String.format("%s.consolidated.%s.%s", tableName, metric, type);
+  }
+
+  public void recordWriteStats(long totalRecs, long cumulativeWriteTimeInMsec, long fileSizeInBytes) {

Review comment:
       Looked into this. Yes, by overriding the add/merge functions in HoodieObservabilityMetrics, we can merge the executor-level stats into consolidated stats.
   
   However, doing it this way provides flexibility and more control over granularity. For example, we can aggregate separately over <tablename>.consolidated.<metrics>.INSERT and <tablename>.consolidated.<metrics>.UPSERT operations.
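
As a rough illustration of the two granularities, the sketch below reuses the format strings from the diff above to build a per-executor key and a consolidated key. The class name `MetricKeySketch`, the method names, and the sample table and host values are hypothetical.

```java
/** Hypothetical helper, not part of the PR: builds metric keys at two granularities. */
final class MetricKeySketch {
  /** Per-executor key: <table>.<metric>.<writeType>.<host>.<partitionId> */
  static String executorKey(String tableName, String metric, String writeType,
                            String host, long partitionId) {
    return String.format("%s.%s.%s.%s.%d", tableName, metric, writeType, host, partitionId);
  }

  /** Consolidated key: <table>.consolidated.<metric>.<writeType> */
  static String consolidatedKey(String tableName, String metric, String writeType) {
    return String.format("%s.consolidated.%s.%s", tableName, metric, writeType);
  }
}
```

For a table named `trips`, `executorKey("trips", "totalRecordsWritten", "INSERT", "host1", 7)` gives `trips.totalRecordsWritten.INSERT.host1.7`, while `consolidatedKey("trips", "totalRecordsWritten", "UPSERT")` gives `trips.consolidated.totalRecordsWritten.UPSERT`. Because Graphite treats dots as path separators, a host name that itself contains dots would add extra levels to the per-executor key.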







[GitHub] [hudi] hudi-bot commented on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-961587611


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "189503c28d07f874827e91479d78a4869d8b491c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2084",
       "triggerID" : "189503c28d07f874827e91479d78a4869d8b491c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 189503c28d07f874827e91479d78a4869d8b491c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2084) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-914648067


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "189503c28d07f874827e91479d78a4869d8b491c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2084",
       "triggerID" : "189503c28d07f874827e91479d78a4869d8b491c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 189503c28d07f874827e91479d78a4869d8b491c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2084) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nbalajee commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
nbalajee commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r595527355



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
##########
@@ -143,6 +143,9 @@
   public static final String CLIENT_HEARTBEAT_NUM_TOLERABLE_MISSES_PROP = "hoodie.client.heartbeat.tolerable.misses";
   public static final Integer DEFAULT_CLIENT_HEARTBEAT_NUM_TOLERABLE_MISSES = 2;
 
+  public static final String COLLECT_OBSERVABILITY_METRICS = "hoodie.collect.observability.metrics";
+  public static final String DEFAULT_COLLECT_OBSERVABILITY_METRICS = "true";

Review comment:
       Done

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java
##########
@@ -377,9 +388,21 @@ public void write(HoodieRecord record, Option<IndexedRecord> insertValue) {
         writer.close();
 
         // update final size, once for all log files
+        long totalLogFileSize = 0;
         for (WriteStatus status: statuses) {
           long logFileSize = FSUtils.getFileSize(fs, new Path(config.getBasePath(), status.getStat().getPath()));
           status.getStat().setFileSizeInBytes(logFileSize);
+          totalLogFileSize += logFileSize;
+        }
+
+        if (config.isMetricsOn() && config.shouldCollectObservabilityMetrics()) {

Review comment:
       Done.







[GitHub] [hudi] nbalajee commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
nbalajee commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r595527836



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java
##########
@@ -113,13 +121,15 @@ public void write(HoodieRecord record, Option<IndexedRecord> avroRecord) {
     Option recordMetadata = record.getData().getMetadata();
     try {
       if (avroRecord.isPresent()) {
+        writeTimer.startTimer();

Review comment:
       Fixed.

##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java
##########
@@ -70,25 +72,32 @@
     AbstractHoodieWriteClient<T, JavaRDD<HoodieRecord<T>>, JavaRDD<HoodieKey>, JavaRDD<WriteStatus>> {
 
   private static final Logger LOG = LogManager.getLogger(SparkRDDWriteClient.class);
+  protected final transient HoodieObservabilityMetrics observabilityMetrics;
 
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig clientConfig) {
-    super(context, clientConfig);
+    this(context, clientConfig, Option.empty());
   }
 
   @Deprecated
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeConfig, boolean rollbackPending) {
-    super(context, writeConfig);
+    this(context, writeConfig, rollbackPending, Option.empty());
   }
 
   @Deprecated
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeConfig, boolean rollbackPending,
                              Option<EmbeddedTimelineService> timelineService) {
     super(context, writeConfig, timelineService);
+    this.observabilityMetrics = (HoodieObservabilityMetrics) Registry.getRegistry(
+        HoodieObservabilityStat.OBSERVABILITY_REGISTRY_NAME, HoodieObservabilityMetrics.class.getName());
+    observabilityMetrics.registerWithSpark(context, config);
   }
 
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeConfig,
                              Option<EmbeddedTimelineService> timelineService) {
     super(context, writeConfig, timelineService);
+    this.observabilityMetrics = (HoodieObservabilityMetrics) Registry.getRegistry(

Review comment:
       Moved.







[GitHub] [hudi] n3nash commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r599791391



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java
##########
@@ -196,6 +200,10 @@ public IOType getIOType() {
       stat.setRuntimeStats(runtimeStats);

Review comment:
       We already have these `runtimeStats`; can we use them for write times instead?







[GitHub] [hudi] hudi-bot removed a comment on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-914648067


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "189503c28d07f874827e91479d78a4869d8b491c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2084",
       "triggerID" : "189503c28d07f874827e91479d78a4869d8b491c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 189503c28d07f874827e91479d78a4869d8b491c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2084) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>








[GitHub] [hudi] codecov-io edited a comment on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-786454326


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=h1) Report
   > Merging [#2607](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=desc) (0194bdc) into [master](https://codecov.io/gh/apache/hudi/commit/b038623ed3318404f9bc4707f005a9fc458c0adf?el=desc) (b038623) will **decrease** coverage by `7.88%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2607/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2607      +/-   ##
   ============================================
   - Coverage     52.04%   44.15%   -7.89%     
   + Complexity     3580     2879     -701     
   ============================================
     Files           466      410      -56     
     Lines         22325    19282    -3043     
     Branches       2379     2033     -346     
   ============================================
   - Hits          11619     8514    -3105     
   - Misses         9692    10066     +374     
   + Partials       1014      702     -312     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `51.11% <0.00%> (-0.19%)` | `0.00 <0.00> (ø)` | |
   | hudiflink | `53.96% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.52% <ø> (-59.96%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...che/hudi/common/model/HoodieObservabilityStat.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZU9ic2VydmFiaWxpdHlTdGF0LmphdmE=) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | ... and [89 more](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] nbalajee commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
nbalajee commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r595527683



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java
##########
@@ -60,6 +65,8 @@
   protected long recordsDeleted = 0;
   private Map<String, HoodieRecord<T>> recordMap;
   private boolean useWriterSchema = false;
+  private HoodieTimer writeTimer = null;

Review comment:
       Done.







[GitHub] [hudi] n3nash commented on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
n3nash commented on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-851696762


   @nsivabalan This PR is being tested internally right now. Let's leave it in WIP and come back to this later. I don't think this will be addressed in the next week. 





[GitHub] [hudi] nbalajee commented on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
nbalajee commented on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-818882851


   > 1. Logical vs Physical Space
          - With RFC-15, this information can be reported from metadata. (Before RFC-15, it may take an additional listing.)
   > 2. CPU used per record (CPU usage of this job ?)
          - The cumulative sum of time taken translates to vcore-seconds.
   > 3. Mem used per record (Mem usage of this job ?)
          - The cumulative sum of time taken translates to memory-seconds (time taken * {mem allocated per executor} by default; this can be refined with actual memory usage stats).
   > 4. Time taken per record
          - Write stage time and bloom index stage time are collected per executor.
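
A minimal sketch of the conversions described in points 2 and 3, assuming the cumulative time is tracked in milliseconds and the per-executor memory allocation is known in MB. The class `ResourceSecondsSketch`, its method names, and the `vcoresPerTask` parameter are hypothetical and do not come from the PR.

```java
/** Hypothetical helper, not part of the PR: converts cumulative time into resource-seconds. */
final class ResourceSecondsSketch {
  /** vcore-seconds = cumulative time in seconds * vcores held by each task. */
  static double vcoreSeconds(long cumulativeTimeInMsec, int vcoresPerTask) {
    return cumulativeTimeInMsec / 1000.0 * vcoresPerTask;
  }

  /** memory-seconds (MB * s) = cumulative time in seconds * memory allocated per executor. */
  static double memoryMbSeconds(long cumulativeTimeInMsec, long memAllocatedPerExecutorInMb) {
    return cumulativeTimeInMsec / 1000.0 * memAllocatedPerExecutorInMb;
  }
}
```

For example, 90,000 ms of cumulative time with one vcore per task and 4,096 MB per executor corresponds to 90 vcore-seconds and 368,640 MB-seconds. Using the allocated memory gives an upper bound; actual usage stats would tighten it, as noted in point 3.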
   
   





[GitHub] [hudi] hudi-bot commented on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-914648067


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "189503c28d07f874827e91479d78a4869d8b491c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "189503c28d07f874827e91479d78a4869d8b491c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 189503c28d07f874827e91479d78a4869d8b491c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codecov-io edited a comment on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-786454326


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=h1) Report
   > Merging [#2607](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=desc) (84ce9ba) into [master](https://codecov.io/gh/apache/hudi/commit/1277c62398c58690cd5a6aa78048335d5313ca05?el=desc) (1277c62) will **decrease** coverage by `0.07%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2607/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2607      +/-   ##
   ============================================
   - Coverage     51.73%   51.66%   -0.08%     
     Complexity     3594     3594              
   ============================================
     Files           475      476       +1     
     Lines         22525    22552      +27     
     Branches       2402     2403       +1     
   ============================================
   - Hits          11654    11652       -2     
   - Misses         9857     9885      +28     
   - Partials       1014     1015       +1     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `50.79% <0.00%> (-0.16%)` | `0.00 <0.00> (ø)` | |
   | hudiflink | `54.07% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `70.93% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `45.70% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `69.94% <ø> (ø)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...pache/hudi/common/model/HoodieExecutorMetrics.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUV4ZWN1dG9yTWV0cmljcy5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | `26.00% <0.00%> (ø%)` | |
   








[GitHub] [hudi] n3nash commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r586940559



##########
File path: hudi-client/hudi-client-common/pom.xml
##########
@@ -195,6 +195,11 @@
       <artifactId>junit-platform-commons</artifactId>
       <scope>test</scope>
     </dependency>
+      <dependency>

Review comment:
       Please remove this dependency







[GitHub] [hudi] codecov-io edited a comment on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-786454326


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=h1) Report
   > Merging [#2607](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=desc) (84ce9ba) into [master](https://codecov.io/gh/apache/hudi/commit/1277c62398c58690cd5a6aa78048335d5313ca05?el=desc) (1277c62) will **decrease** coverage by `1.57%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2607/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2607      +/-   ##
   ============================================
   - Coverage     51.73%   50.16%   -1.58%     
   + Complexity     3594     3209     -385     
   ============================================
     Files           475      418      -57     
     Lines         22525    19374    -3151     
     Branches       2402     2045     -357     
   ============================================
   - Hits          11654     9718    -1936     
   + Misses         9857     8833    -1024     
   + Partials       1014      823     -191     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `50.79% <0.00%> (-0.16%)` | `0.00 <0.00> (ø)` | |
   | hudiflink | `54.07% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `69.94% <ø> (ø)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...pache/hudi/common/model/HoodieExecutorMetrics.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUV4ZWN1dG9yTWV0cmljcy5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | `26.00% <0.00%> (ø%)` | |
   | [...udi/timeline/service/handlers/TimelineHandler.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvaGFuZGxlcnMvVGltZWxpbmVIYW5kbGVyLmphdmE=) | | | |
   | [...c/main/java/org/apache/hudi/dla/DLASyncConfig.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zeW5jL2h1ZGktZGxhLXN5bmMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZGxhL0RMQVN5bmNDb25maWcuamF2YQ==) | | | |
   | [...org/apache/hudi/spark3/internal/DefaultSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9EZWZhdWx0U291cmNlLmphdmE=) | | | |
   | [...src/main/java/org/apache/hudi/QuickstartUtils.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvUXVpY2tzdGFydFV0aWxzLmphdmE=) | | | |
   | [...spark/src/main/scala/org/apache/hudi/package.scala](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL3BhY2thZ2Uuc2NhbGE=) | | | |
   | [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | | | |
   | [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVCdWxrSW5zZXJ0RGF0YUludGVybmFsV3JpdGVyRmFjdG9yeS5qYXZh) | | | |
   | [.../apache/hudi/hive/MultiPartKeysValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTXVsdGlQYXJ0S2V5c1ZhbHVlRXh0cmFjdG9yLmphdmE=) | | | |
   | ... and [51 more](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] codecov-io edited a comment on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-786454326


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=h1) Report
   > Merging [#2607](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=desc) (0194bdc) into [master](https://codecov.io/gh/apache/hudi/commit/b038623ed3318404f9bc4707f005a9fc458c0adf?el=desc) (b038623) will **decrease** coverage by `5.34%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2607/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2607      +/-   ##
   ============================================
   - Coverage     52.04%   46.69%   -5.35%     
   + Complexity     3580     3264     -316     
   ============================================
     Files           466      467       +1     
     Lines         22325    22362      +37     
     Branches       2379     2380       +1     
   ============================================
   - Hits          11619    10442    -1177     
   - Misses         9692    11026    +1334     
   + Partials       1014      894     -120     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `51.11% <0.00%> (-0.19%)` | `0.00 <0.00> (ø)` | |
   | hudiflink | `53.96% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `49.62% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `9.52% <ø> (-59.96%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...che/hudi/common/model/HoodieObservabilityStat.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZU9ic2VydmFiaWxpdHlTdGF0LmphdmE=) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | ... and [32 more](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] prashantwason commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
prashantwason commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r591853011



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java
##########
@@ -377,9 +388,21 @@ public void write(HoodieRecord record, Option<IndexedRecord> insertValue) {
         writer.close();
 
         // update final size, once for all log files
+        long totalLogFileSize = 0;
         for (WriteStatus status: statuses) {
           long logFileSize = FSUtils.getFileSize(fs, new Path(config.getBasePath(), status.getStat().getPath()));
           status.getStat().setFileSizeInBytes(logFileSize);
+          totalLogFileSize += logFileSize;
+        }
+
+        if (config.isMetricsOn() && config.shouldCollectObservabilityMetrics()) {

Review comment:
       This code is repeated across handles, so consider moving it into the table itself and simplifying the metrics-enabled check.
   
   hoodieTable.updateObservabilityMetrics(xxx)
   
   or 
   
   hoodieTable.getObservabilityMetrics().ifPresent(....)
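
A minimal sketch of the second suggestion, under stated assumptions: `TableSketch`, `ObservabilitySink` and `AppendHandleSketch` are hypothetical stand-ins rather than Hudi classes, and `java.util.Optional` is used in place of Hudi's own `Option`. The point is only that the two config flags are checked once, inside the table, so each handle reduces to a single `ifPresent` call.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical collector; stands in for whatever object receives the stats.
class ObservabilitySink {
  final AtomicLong totalBytes = new AtomicLong();

  void recordLogFileSize(long bytes) {
    totalBytes.addAndGet(bytes);
  }
}

// Hypothetical table; owns the "are observability metrics enabled?" decision.
class TableSketch {
  private final boolean metricsOn;
  private final boolean collectObservability;
  private final ObservabilitySink sink = new ObservabilitySink();

  TableSketch(boolean metricsOn, boolean collectObservability) {
    this.metricsOn = metricsOn;
    this.collectObservability = collectObservability;
  }

  // Both flags are evaluated here, once, instead of in every handle.
  Optional<ObservabilitySink> getObservabilityMetrics() {
    return (metricsOn && collectObservability) ? Optional.of(sink) : Optional.empty();
  }
}

// Hypothetical handle; no config checks remain at the call site.
class AppendHandleSketch {
  static void close(TableSketch table, long totalLogFileSize) {
    table.getObservabilityMetrics().ifPresent(m -> m.recordLogFileSize(totalLogFileSize));
  }
}
```

With this shape, a handle for a table that has observability disabled simply does nothing, and adding a new handle does not require repeating the flag check.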
   

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java
##########
@@ -60,6 +65,8 @@
   protected long recordsDeleted = 0;
   private Map<String, HoodieRecord<T>> recordMap;
   private boolean useWriterSchema = false;
+  private HoodieTimer writeTimer = null;

Review comment:
       Used in all handles so maybe move it to the base class.

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
##########
@@ -143,6 +143,9 @@
   public static final String CLIENT_HEARTBEAT_NUM_TOLERABLE_MISSES_PROP = "hoodie.client.heartbeat.tolerable.misses";
   public static final Integer DEFAULT_CLIENT_HEARTBEAT_NUM_TOLERABLE_MISSES = 2;
 
+  public static final String COLLECT_OBSERVABILITY_METRICS = "hoodie.collect.observability.metrics";
+  public static final String DEFAULT_COLLECT_OBSERVABILITY_METRICS = "true";

Review comment:
       Defaulting to false is better unless we are sure this does not cause scalability issues with the reporter/metrics platform.
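
As an illustration of the opt-in behaviour suggested here, the following sketch reads the flag with a default of `"false"`. Only the property key comes from the diff above; `ObservabilityConfigSketch` and the use of `java.util.Properties` are assumptions made for the example, not how `HoodieWriteConfig` is implemented.

```java
import java.util.Properties;

// Hypothetical holder for the flag; not the real HoodieWriteConfig.
class ObservabilityConfigSketch {
  static final String COLLECT_OBSERVABILITY_METRICS = "hoodie.collect.observability.metrics";
  // Opt-in default, as the review comment proposes.
  static final String DEFAULT_COLLECT_OBSERVABILITY_METRICS = "false";

  static boolean shouldCollectObservabilityMetrics(Properties props) {
    return Boolean.parseBoolean(
        props.getProperty(COLLECT_OBSERVABILITY_METRICS, DEFAULT_COLLECT_OBSERVABILITY_METRICS));
  }
}
```

Jobs that never set the property would then publish no per-executor metrics, and an operator would enable collection explicitly once the reporter is known to cope with the extra volume.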

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java
##########
@@ -113,13 +121,15 @@ public void write(HoodieRecord record, Option<IndexedRecord> avroRecord) {
     Option recordMetadata = record.getData().getMetadata();
     try {
       if (avroRecord.isPresent()) {
+        writeTimer.startTimer();

Review comment:
       If an exception is thrown, endTimer() is not called. Is it OK to call startTimer() multiple times without a matching endTimer()?
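
One way to make the question moot is to pair every start with a stop in a `finally` block. The sketch below assumes a hypothetical `StopwatchSketch` class written for this example; it does not reproduce `HoodieTimer` or make any claim about how that class handles repeated starts.

```java
// Hypothetical timer used only for illustration.
class StopwatchSketch {
  private boolean running = false;
  private long startNanos = 0L;
  private long totalNanos = 0L;
  private int completedIntervals = 0;

  void start() {
    running = true;
    startNanos = System.nanoTime();
  }

  void stop() {
    if (running) {
      totalNanos += System.nanoTime() - startNanos;
      completedIntervals++;
      running = false;
    }
  }

  boolean isRunning() {
    return running;
  }

  int completedIntervals() {
    return completedIntervals;
  }

  long totalNanos() {
    return totalNanos;
  }
}

class TimedWriteSketch {
  // The finally block guarantees stop() runs even when the write throws,
  // so a later start() never sees a dangling interval.
  static void timedWrite(StopwatchSketch timer, Runnable write) {
    timer.start();
    try {
      write.run();
    } finally {
      timer.stop();
    }
  }
}
```

The exception still propagates to the caller; only the timer bookkeeping is protected.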

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieObservabilityStat.java
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.model;
+
+import org.apache.hudi.common.metrics.Registry;
+import org.apache.hudi.common.util.Option;
+
+import java.io.Serializable;
+
+/**
+ * Observability related metrics collection operations.
+ */
+public class HoodieObservabilityStat implements Serializable {
+  public static final String OBSERVABILITY_REGISTRY_NAME = "Observability";
+
+  // define a unique metric name string for each metric to be collected.
+  public static final String PARQUET_NORMALIZED_WRITE_TIME = "writeTimePerRecordInUSec";
+  public static final String PARQUET_CUMULATIVE_WRITE_TIME = "cumulativeParquetWriteTimeInUSec";
+  public static final String PARQUET_WRITE_TIME_PER_MB_IN_USEC = "writeTimePerMBInUSec";
+  public static final String PARQUET_WRITE_THROUGHPUT_MBPS = "writeThroughputMBps";
+  public static final String TOTAL_RECORDS_WRITTEN = "totalRecordsWritten";
+
+  public enum WriteType {
+    INSERT,
+    UPSERT,
+    UPDATE,
+  }
+
+  public static long ONE_MB = 1024 * 1024;
+  public static long USEC_PER_SEC = 1000 * 1000;
+  Option<Registry> observabilityRegistry;
+  String tableName;
+  WriteType writeType;
+  String hostName;
+  Long stageId;
+  Long partitionId;
+
+  public HoodieObservabilityStat(Option<Registry> registry, String tableName, WriteType type, String host,
+                                 long stageId, long partitionId) {
+    this.observabilityRegistry = registry;
+    this.tableName = tableName;
+    this.writeType = type;
+    this.hostName = host;
+    this.stageId = stageId;
+    this.partitionId = partitionId;
+  }
+
+  private String getWriteMetricWithTypeHostNameAndPartitionId(String tableName, String metric, String type,
+                                                              String host, long partitionId) {
+    return String.format("%s.%s.%s.%s.%d", tableName, metric, type, host, partitionId);
+  }
+
+  private String getConsolidatedWriteMetricKey(String tableName, String metric, String type) {
+    return String.format("%s.consolidated.%s.%s", tableName, metric, type);
+  }
+
+  public void recordWriteStats(long totalRecs, long cumulativeWriteTimeInMsec, long fileSizeInBytes) {

Review comment:
       Can this be done within the HoodieObservabilityMetrics while merging the updates from each executor?
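
For context when reading the diff above, this sketch shows the metric names the two `String.format` patterns produce, plus one plausible way to derive per-record time and throughput from the three arguments of `recordWriteStats`. The format strings and constants are taken from the diff; the class name `MetricKeySketch`, the sample table/host values and the two derivation formulas are assumptions, since the method body is not shown in this thread. `Locale.ROOT` is pinned only so the output is reproducible.

```java
import java.util.Locale;

// Hypothetical helper; mirrors the key formats in the diff, nothing more.
class MetricKeySketch {
  static final long ONE_MB = 1024 * 1024;

  // Same pattern as getWriteMetricWithTypeHostNameAndPartitionId in the diff.
  static String perPartitionKey(String table, String metric, String type, String host, long partitionId) {
    return String.format(Locale.ROOT, "%s.%s.%s.%s.%d", table, metric, type, host, partitionId);
  }

  // Same pattern as getConsolidatedWriteMetricKey in the diff.
  static String consolidatedKey(String table, String metric, String type) {
    return String.format(Locale.ROOT, "%s.consolidated.%s.%s", table, metric, type);
  }

  // Assumed derivation: milliseconds converted to microseconds, divided by record count.
  static long writeTimePerRecordInUSec(long totalRecs, long cumulativeWriteTimeInMsec) {
    return totalRecs == 0 ? 0 : cumulativeWriteTimeInMsec * 1000 / totalRecs;
  }

  // Assumed derivation: megabytes written divided by seconds spent writing.
  static double writeThroughputMBps(long fileSizeInBytes, long cumulativeWriteTimeInMsec) {
    if (cumulativeWriteTimeInMsec == 0) {
      return 0.0;
    }
    return (fileSizeInBytes / (double) ONE_MB) / (cumulativeWriteTimeInMsec / 1000.0);
  }
}
```

Because the per-partition key embeds host and partition id, the number of distinct metric names grows with the number of tasks, which is relevant to the scalability concern raised about the default setting.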
   
   

##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java
##########
@@ -70,25 +72,32 @@
     AbstractHoodieWriteClient<T, JavaRDD<HoodieRecord<T>>, JavaRDD<HoodieKey>, JavaRDD<WriteStatus>> {
 
   private static final Logger LOG = LogManager.getLogger(SparkRDDWriteClient.class);
+  protected final transient HoodieObservabilityMetrics observabilityMetrics;
 
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig clientConfig) {
-    super(context, clientConfig);
+    this(context, clientConfig, Option.empty());
   }
 
   @Deprecated
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeConfig, boolean rollbackPending) {
-    super(context, writeConfig);
+    this(context, writeConfig, rollbackPending, Option.empty());
   }
 
   @Deprecated
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeConfig, boolean rollbackPending,
                              Option<EmbeddedTimelineService> timelineService) {
     super(context, writeConfig, timelineService);
+    this.observabilityMetrics = (HoodieObservabilityMetrics) Registry.getRegistry(
+        HoodieObservabilityStat.OBSERVABILITY_REGISTRY_NAME, HoodieObservabilityMetrics.class.getName());
+    observabilityMetrics.registerWithSpark(context, config);
   }
 
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeConfig,
                              Option<EmbeddedTimelineService> timelineService) {
     super(context, writeConfig, timelineService);
+    this.observabilityMetrics = (HoodieObservabilityMetrics) Registry.getRegistry(

Review comment:
       There is an initialize function where other metric registries are initialized. Better to move this code there.







[GitHub] [hudi] codecov-io edited a comment on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-786454326


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=h1) Report
   > Merging [#2607](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=desc) (84ce9ba) into [master](https://codecov.io/gh/apache/hudi/commit/1277c62398c58690cd5a6aa78048335d5313ca05?el=desc) (1277c62) will **increase** coverage by `18.20%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2607/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2607       +/-   ##
   =============================================
   + Coverage     51.73%   69.94%   +18.20%     
   + Complexity     3594      366     -3228     
   =============================================
     Files           475       53      -422     
     Lines         22525     1963    -20562     
     Branches       2402      235     -2167     
   =============================================
   - Hits          11654     1373    -10281     
   + Misses         9857      458     -9399     
   + Partials       1014      132      -882     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `69.94% <ø> (ø)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...a/org/apache/hudi/common/util/ValidationUtils.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvVmFsaWRhdGlvblV0aWxzLmphdmE=) | | | |
   | [...java/org/apache/hudi/sink/StreamWriteOperator.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlT3BlcmF0b3IuamF2YQ==) | | | |
   | [...rg/apache/hudi/hadoop/HoodieHFileRecordReader.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0hvb2RpZUhGaWxlUmVjb3JkUmVhZGVyLmphdmE=) | | | |
   | [.../java/org/apache/hudi/common/util/RateLimiter.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvUmF0ZUxpbWl0ZXIuamF2YQ==) | | | |
   | [...org/apache/hudi/common/table/log/AppendResult.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9BcHBlbmRSZXN1bHQuamF2YQ==) | | | |
   | [...apache/hudi/common/engine/TaskContextSupplier.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2VuZ2luZS9UYXNrQ29udGV4dFN1cHBsaWVyLmphdmE=) | | | |
   | [...a/org/apache/hudi/common/util/CompactionUtils.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ29tcGFjdGlvblV0aWxzLmphdmE=) | | | |
   | [...ache/hudi/common/table/timeline/HoodieInstant.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZUluc3RhbnQuamF2YQ==) | | | |
   | [.../java/org/apache/hudi/HoodieDataSourceHelpers.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvSG9vZGllRGF0YVNvdXJjZUhlbHBlcnMuamF2YQ==) | | | |
   | [...va/org/apache/hudi/common/fs/ConsistencyGuard.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL0NvbnNpc3RlbmN5R3VhcmQuamF2YQ==) | | | |
   | ... and [412 more](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] codecov-io commented on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-786454326


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=h1) Report
   > Merging [#2607](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=desc) (ab93c26) into [master](https://codecov.io/gh/apache/hudi/commit/022df0d1b134422f7b6f305cd7ec04b25caa23f0?el=desc) (022df0d) will **decrease** coverage by `41.64%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2607/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #2607       +/-   ##
   ============================================
   - Coverage     51.26%   9.61%   -41.65%     
   + Complexity     3241      48     -3193     
   ============================================
     Files           438      53      -385     
     Lines         20126    1944    -18182     
     Branches       2079     235     -1844     
   ============================================
   - Hits          10318     187    -10131     
   + Misses         8955    1744     -7211     
   + Partials        853      13      -840     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.61% <ø> (-59.83%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | ... and [415 more](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] codecov-io edited a comment on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-786454326


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=h1) Report
   > Merging [#2607](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=desc) (ab93c26) into [master](https://codecov.io/gh/apache/hudi/commit/022df0d1b134422f7b6f305cd7ec04b25caa23f0?el=desc) (022df0d) will **increase** coverage by `18.28%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2607/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2607       +/-   ##
   =============================================
   + Coverage     51.26%   69.54%   +18.28%     
   + Complexity     3241      363     -2878     
   =============================================
     Files           438       53      -385     
     Lines         20126     1944    -18182     
     Branches       2079      235     -1844     
   =============================================
   - Hits          10318     1352     -8966     
   + Misses         8955      458     -8497     
   + Partials        853      134      -719     
   ```
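The percentages in the Coverage Diff table above can be reproduced from its own Hits and Lines rows. The sketch below is only an illustration: the class and method names are hypothetical, and truncating to two decimals is an assumption that happens to match the printed figures (the report does not state Codecov's rounding rule). Note also that the pull request side covers far fewer files (53 versus 438), so the headline increase may reflect which modules reported coverage rather than a change in the code itself.

```java
import java.util.Locale;

// Hypothetical helper, not part of Hudi or Codecov.
final class CoverageMath {
  // Coverage expressed in hundredths of a percent, truncated (assumed, not rounded).
  static long hundredthsOfPercent(long hits, long lines) {
    if (lines <= 0) {
      throw new IllegalArgumentException("lines must be positive");
    }
    return hits * 10_000L / lines;
  }

  // Formats a non-negative value in hundredths of a percent, e.g. 5126 -> "51.26%".
  static String format(long hundredths) {
    return String.format(Locale.ROOT, "%d.%02d%%", hundredths / 100, hundredths % 100);
  }

  public static void main(String[] args) {
    long master = hundredthsOfPercent(10318, 20126); // Hits / Lines on master
    long pr = hundredthsOfPercent(1352, 1944);       // Hits / Lines on #2607
    System.out.println(format(master));              // 51.26%
    System.out.println(format(pr));                  // 69.54%
    System.out.println("+" + format(pr - master));   // +18.28%
  }
}
```

In both columns Hits + Misses + Partials equals Lines (10318 + 8955 + 853 = 20126 and 1352 + 458 + 134 = 1944), so the table is internally consistent.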
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `69.54% <ø> (+0.10%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2607?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0hvb2RpZUJ1bGtJbnNlcnREYXRhSW50ZXJuYWxXcml0ZXJGYWN0b3J5LmphdmE=) | | | |
   | [...in/java/org/apache/hudi/common/model/BaseFile.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0Jhc2VGaWxlLmphdmE=) | | | |
   | [...di/common/table/timeline/HoodieActiveTimeline.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZUFjdGl2ZVRpbWVsaW5lLmphdmE=) | | | |
   | [...oop/realtime/HoodieParquetRealtimeInputFormat.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL0hvb2RpZVBhcnF1ZXRSZWFsdGltZUlucHV0Rm9ybWF0LmphdmE=) | | | |
   | [...mmon/table/log/HoodieUnMergedLogRecordScanner.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVVbk1lcmdlZExvZ1JlY29yZFNjYW5uZXIuamF2YQ==) | | | |
   | [...e/timeline/versioning/clean/CleanPlanMigrator.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvY2xlYW4vQ2xlYW5QbGFuTWlncmF0b3IuamF2YQ==) | | | |
   | [...n/scala/org/apache/hudi/HoodieSparkSqlWriter.scala](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrU3FsV3JpdGVyLnNjYWxh) | | | |
   | [...common/table/view/AbstractTableFileSystemView.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvQWJzdHJhY3RUYWJsZUZpbGVTeXN0ZW1WaWV3LmphdmE=) | | | |
   | [.../hadoop/utils/HoodieRealtimeRecordReaderUtils.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3V0aWxzL0hvb2RpZVJlYWx0aW1lUmVjb3JkUmVhZGVyVXRpbHMuamF2YQ==) | | | |
   | [...util/jvm/OpenJ9MemoryLayoutSpecification32bit.java](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvanZtL09wZW5KOU1lbW9yeUxheW91dFNwZWNpZmljYXRpb24zMmJpdC5qYXZh) | | | |
   | ... and [376 more](https://codecov.io/gh/apache/hudi/pull/2607/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] nsivabalan commented on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-848687118


   @prashantwason / @n3nash: I see you have reviewed this before. Can either of you follow up and take it to completion? It would be good to close it out.





[GitHub] [hudi] nbalajee commented on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
nbalajee commented on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-818309223


   @prashantwason @n3nash - Updated the diff to address the comments. Added bloomIndex metrics.





[GitHub] [hudi] nbalajee commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
nbalajee commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r595568718



##########
File path: hudi-client/hudi-client-common/pom.xml
##########
@@ -195,6 +195,11 @@
       <artifactId>junit-platform-commons</artifactId>
       <scope>test</scope>
     </dependency>
+      <dependency>

Review comment:
       Removed the dependency.







[GitHub] [hudi] n3nash commented on a change in pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#discussion_r601926515



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java
##########
@@ -336,8 +340,12 @@ public void write(GenericRecord oldRecord) {
       }
 
       long fileSizeInBytes = FSUtils.getFileSize(fs, newFilePath);
-      HoodieWriteStat stat = writeStatus.getStat();
 
+      // record write metrics
+      executorMetrics.recordWriteMetrics(getIOType(), InetAddress.getLocalHost().getHostName(),

Review comment:
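       // Pseudocode; the method name is illustrative.
       private Map<String, List<Tuple>> getWriteMetrics() {
         List<Tuple> list = new ArrayList<>();
         list.add(new Tuple("io_type", getIOType()));
         ..
         ..
         return Collections.singletonMap(CONSTANTS.WRITE_METRICS, list);
       }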
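
The suggestion above, collecting the write metrics into a list of name/value tuples keyed by a metrics-group constant, might be sketched as follows. This is a minimal, self-contained illustration and not the code in the pull request: the class name, the `WRITE_METRICS` value, the `host_name` and `file_size_bytes` keys, and the use of `Map.Entry` in place of `Tuple` are all assumptions; only `io_type`, the host name, and `fileSizeInBytes` come from the diff under review.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from the Hudi code base.
final class WriteMetricsSketch {
  // Stands in for CONSTANTS.WRITE_METRICS from the review comment (value assumed).
  static final String WRITE_METRICS = "write_metrics";

  // A (metric name, value) pair; Map.Entry is used here in place of "Tuple".
  static Map.Entry<String, Object> tuple(String name, Object value) {
    return new SimpleImmutableEntry<>(name, value);
  }

  // Groups the per-write values into one list under a single metrics key.
  static Map<String, List<Map.Entry<String, Object>>> buildWriteMetrics(
      String ioType, String hostName, long fileSizeInBytes) {
    List<Map.Entry<String, Object>> list = new ArrayList<>();
    list.add(tuple("io_type", ioType));
    list.add(tuple("host_name", hostName));              // key name assumed
    list.add(tuple("file_size_bytes", fileSizeInBytes)); // key name assumed
    return Collections.singletonMap(WRITE_METRICS, Collections.unmodifiableList(list));
  }

  public static void main(String[] args) {
    buildWriteMetrics("MERGE", "executor-1", 1024L).forEach((group, tuples) ->
        tuples.forEach(t -> System.out.println(group + "." + t.getKey() + "=" + t.getValue())));
  }
}
```

Returning one map keeps the call site short and lets new metrics be added without changing the signature of the recording method, which appears to be the intent of the comment.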







[GitHub] [hudi] hudi-bot commented on pull request #2607: [HUDI-1643] Hudi observability - framework to report stats from execu…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #2607:
URL: https://github.com/apache/hudi/pull/2607#issuecomment-961587611


   ## CI report:
   
   * 189503c28d07f874827e91479d78a4869d8b491c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2084) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org