Posted to issues@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/03/04 09:41:01 UTC

[jira] [Work logged] (BEAM-6627) Use Metrics API in IO performance tests

     [ https://issues.apache.org/jira/browse/BEAM-6627?focusedWorklogId=207054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-207054 ]

ASF GitHub Bot logged work on BEAM-6627:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Mar/19 09:40
            Start Date: 04/Mar/19 09:40
    Worklog Time Spent: 10m 
      Work Description: mwalenia commented on pull request #7772: [BEAM-6627] Added Metrics API processing time reporting to TextIOIT
URL: https://github.com/apache/beam/pull/7772#discussion_r261982116
 
 

 ##########
 File path: sdks/java/io/file-based-io-tests/src/test/java/org/apache/beam/sdk/io/text/TextIOIT.java
 ##########
 @@ -127,28 +140,49 @@ public void writeThenReadAll() {
 
     PipelineResult result = pipeline.run();
     result.waitUntilFinish();
-    publishGcsResults(result);
+    gatherAndPublishMetrics(result);
   }
 
-  private void publishGcsResults(PipelineResult result) {
+  private void gatherAndPublishMetrics(PipelineResult result) {
+    String uuid = UUID.randomUUID().toString();
+    Timestamp timestamp = Timestamp.now();
+    List<NamedTestResult> namedTestResults = readMetrics(result, uuid, timestamp);
+    publishToBigQuery(namedTestResults, bigQueryDataset, bigQueryTable);
+    ConsoleResultPublisher.publish(namedTestResults, uuid, timestamp.toString());
+  }
+
+  private List<NamedTestResult> readMetrics(
+      PipelineResult result, String uuid, Timestamp timestamp) {
+    List<NamedTestResult> results = new ArrayList<>();
+
+    MetricsReader reader = new MetricsReader(result, FILEIOIT_NAMESPACE);
+    long writeStartTime = reader.getStartTimeMetric("startTime");
+    long writeEndTime = reader.getEndTimeMetric("middleTime");
+    long readStartTime = reader.getStartTimeMetric("middleTime");
+    long readEndTime = reader.getEndTimeMetric("endTime");
+    double writeTime = (writeEndTime - writeStartTime) / 1000.0;
+    double readTime = (readEndTime - readStartTime) / 1000.0;
+    double copiesPerSec = calculateGcsMetric(result);
+
+    if (copiesPerSec > 0) {
+      results.add(
+          NamedTestResult.create(uuid, timestamp.toString(), "copies_per_sec", copiesPerSec));
+    }
+
+    results.add(NamedTestResult.create(uuid, timestamp.toString(), "read_time", readTime));
+    results.add(NamedTestResult.create(uuid, timestamp.toString(), "write_time", writeTime));
+
+    return results;
+  }
+
+  private double calculateGcsMetric(PipelineResult result) {
 
 Review comment:
   @udim A big advantage of my approach is built-in error checking and reporting. For example, passing an argument with a typo produces:
   
   `java.lang.IllegalArgumentException: Class interface org.apache.beam.sdk.testing.TestPipelineOptions missing a property named 'compresionType'. Did you mean 'compressionType'?`
   
   whereas there's no such mechanism when passing a list of strings as a parameter value. A typo there would fail silently: to switch reporting of individual metrics on and off, we'd have to search for strings in an array, and a misspelled name would simply never match.
   
   I think we should stick with explicitly set flags for subsequent metrics. WDYT?
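
   To make the difference concrete, here is a minimal sketch in plain Java. It does not use Beam's options classes; `MetricFlagsSketch`, `parseFlags`, `isEnabled`, and the flag and metric names are all hypothetical and exist only for illustration. It assumes flags are passed as `--name=value`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from Beam or from this PR. */
class MetricFlagsSketch {

  /** Explicit flags: every switch is a declared, named property (names are assumptions). */
  static final List<String> KNOWN_FLAGS = Arrays.asList("reportReadTime", "reportWriteTime");

  /** Parses "--name=true|false" arguments and rejects names that were never declared. */
  static Map<String, Boolean> parseFlags(String... args) {
    Map<String, Boolean> flags = new LinkedHashMap<>();
    for (String known : KNOWN_FLAGS) {
      flags.put(known, false); // every declared flag defaults to off
    }
    for (String arg : args) {
      if (!arg.startsWith("--") || !arg.contains("=")) {
        throw new IllegalArgumentException("Expected --name=value but got '" + arg + "'");
      }
      int eq = arg.indexOf('=');
      String name = arg.substring(2, eq);
      if (!flags.containsKey(name)) {
        // A typo in the flag name fails loudly here instead of being ignored.
        throw new IllegalArgumentException("Missing a property named '" + name + "'");
      }
      flags.put(name, Boolean.parseBoolean(arg.substring(eq + 1)));
    }
    return flags;
  }

  /** String-list variant: a misspelled entry never matches, so the metric is just skipped. */
  static boolean isEnabled(List<String> enabledMetrics, String metric) {
    return enabledMetrics.contains(metric);
  }
}
```

   In this sketch, `parseFlags("--reportReedTime=true")` throws an `IllegalArgumentException`, while `isEnabled(Arrays.asList("read_tiem"), "read_time")` quietly returns `false` and the metric goes unreported.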
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 207054)
    Time Spent: 6.5h  (was: 6h 20m)

> Use Metrics API in IO performance tests
> ---------------------------------------
>
>                 Key: BEAM-6627
>                 URL: https://issues.apache.org/jira/browse/BEAM-6627
>             Project: Beam
>          Issue Type: Improvement
>          Components: testing
>            Reporter: Michal Walenia
>            Assignee: Michal Walenia
>            Priority: Minor
>          Time Spent: 6.5h
>  Remaining Estimate: 0h
>