You are viewing a plain text version of this content. The canonical link for it is here.

Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/07/30 07:34:55 UTC

[GitHub] [arrow] cyb70289 opened a new pull request #7863: ARROW-9344: [C++][Flight] Measure latency quantiles

cyb70289 opened a new pull request #7863:
URL: https://github.com/apache/arrow/pull/7863


   Measure average, median, 95%, 99% quantile and maximal latency.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] lidavidm commented on pull request #7863: ARROW-9344: [C++][Flight] Measure latency quantiles

Posted by GitBox <gi...@apache.org>.

lidavidm commented on pull request #7863:
URL: https://github.com/apache/arrow/pull/7863#issuecomment-667190655


   Thank you!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on pull request #7863: ARROW-9344: [C++][Flight] Measure latency quantiles

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on pull request #7863:
URL: https://github.com/apache/arrow/pull/7863#issuecomment-666862103


   > Looks good overall (one minor nit). Do you have sample results?
   
   Test result on my host
   ```
   $ release/arrow-flight-benchmark 
   Using standalone server: false
   Server running with pid 21435
   Testing method: DoGet
   Server host: localhost
   Server port: 31337
   Server host: localhost
   Server port: 31337
   Batch size: 131040
   Batches read: 9768
   Bytes read: 1280000000
   Nanos: 151282157
   Speed: 8069.05 MB/s
   Throughput: 64568.1 batches/s
   Latency mean: 52 uses
   Latency quantile=0.5: 44 usecs
   Latency quantile=0.95: 96 usecs
   Latency quantile=0.99: 1918 usecs
   Latency max: 8513 uses
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] lidavidm commented on pull request #7863: ARROW-9344: [C++][Flight] Measure latency quantiles

Posted by GitBox <gi...@apache.org>.

lidavidm commented on pull request #7863:
URL: https://github.com/apache/arrow/pull/7863#issuecomment-666374219


   > Please **NOTE** this patch uses `boost::accumulators` package, have to install libboost-dev package before building.
   
   Hmm, the build system should take care of getting boost for you (unless you intentionally are building with system dependencies, which is OK).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] lidavidm commented on a change in pull request #7863: ARROW-9344: [C++][Flight] Measure latency quantiles

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #7863:
URL: https://github.com/apache/arrow/pull/7863#discussion_r463021265



##########
File path: cpp/src/arrow/flight/flight_benchmark.cc
##########
@@ -281,9 +313,12 @@ Status RunPerformanceTest(FlightClient* client, bool test_put) {
   // Calculate throughput(IOPS) and latency vs batch size
   std::cout << "Throughput: " << (static_cast<double>(stats.total_batches) / time_elapsed)
             << " batches/s" << std::endl;
-  std::cout << "Latency: "
-            << (static_cast<double>(stats.total_nanos) / 1000 / stats.total_batches)
-            << " usec/batch" << std::endl;
+  std::cout << "Latency mean: " << stats.mean_latency() << " uses" << std::endl;

Review comment:
       nit: `uses` -> `usec` (or just `us`) and the same below

##########
File path: cpp/src/arrow/flight/flight_benchmark.cc
##########
@@ -62,20 +69,40 @@ struct PerformanceResult {
 };
 
 struct PerformanceStats {
-  PerformanceStats() {}
+  using accumulator_type = acc::accumulator_set<
+      double, acc::stats<acc::tag::extended_p_square_quantile(acc::quadratic),
+                         acc::tag::mean, acc::tag::max>>;
+
+  PerformanceStats() : latencies(acc::extended_p_square_probabilities = quantiles) {}
   std::mutex mutex;
   int64_t total_batches = 0;
   int64_t total_records = 0;
   int64_t total_bytes = 0;
-  uint64_t total_nanos = 0;
+  const std::array<double, 3> quantiles = {0.5, 0.95, 0.99};
+  accumulator_type latencies;
 
-  void Update(int64_t total_batches, int64_t total_records, int64_t total_bytes,
-              uint64_t total_nanos) {
+  void Update(int64_t total_batches, int64_t total_records, int64_t total_bytes) {
     std::lock_guard<std::mutex> lock(this->mutex);
     this->total_batches += total_batches;
     this->total_records += total_records;
     this->total_bytes += total_bytes;
-    this->total_nanos += total_nanos;
+  }
+
+  // Invoked per batch in the test loop. Holding a lock looks not scalable.
+  // Tested with 1 ~ 8 threads, no noticeable overhead is observed.
+  // A better approach may be leveraging a lockless queue.

Review comment:
       If this is a concern, we could make this object per-thread and merge at the end.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] lidavidm closed pull request #7863: ARROW-9344: [C++][Flight] Measure latency quantiles

Posted by GitBox <gi...@apache.org>.

lidavidm closed pull request #7863:
URL: https://github.com/apache/arrow/pull/7863


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] github-actions[bot] commented on pull request #7863: ARROW-9344: [C++][Flight] Measure latency quantiles

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on pull request #7863:
URL: https://github.com/apache/arrow/pull/7863#issuecomment-666171816


   https://issues.apache.org/jira/browse/ARROW-9344


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on a change in pull request #7863: ARROW-9344: [C++][Flight] Measure latency quantiles

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on a change in pull request #7863:
URL: https://github.com/apache/arrow/pull/7863#discussion_r463366242



##########
File path: cpp/src/arrow/flight/flight_benchmark.cc
##########
@@ -62,20 +69,40 @@ struct PerformanceResult {
 };
 
 struct PerformanceStats {
-  PerformanceStats() {}
+  using accumulator_type = acc::accumulator_set<
+      double, acc::stats<acc::tag::extended_p_square_quantile(acc::quadratic),
+                         acc::tag::mean, acc::tag::max>>;
+
+  PerformanceStats() : latencies(acc::extended_p_square_probabilities = quantiles) {}
   std::mutex mutex;
   int64_t total_batches = 0;
   int64_t total_records = 0;
   int64_t total_bytes = 0;
-  uint64_t total_nanos = 0;
+  const std::array<double, 3> quantiles = {0.5, 0.95, 0.99};
+  accumulator_type latencies;
 
-  void Update(int64_t total_batches, int64_t total_records, int64_t total_bytes,
-              uint64_t total_nanos) {
+  void Update(int64_t total_batches, int64_t total_records, int64_t total_bytes) {
     std::lock_guard<std::mutex> lock(this->mutex);
     this->total_batches += total_batches;
     this->total_records += total_records;
     this->total_bytes += total_bytes;
-    this->total_nanos += total_nanos;
+  }
+
+  // Invoked per batch in the test loop. Holding a lock looks not scalable.
+  // Tested with 1 ~ 8 threads, no noticeable overhead is observed.
+  // A better approach may be leveraging a lockless queue.

Review comment:
       Hmm, merging quantiles looks not straightforward, boost is using [p-square](https://www.cse.wustl.edu/~jain/papers/ftp/psqr.pdf) algorithm which *estimates* the quantile without storing any value.
   If there's no much variance among threads, maybe we can simply average per thread quantiles or pick the max. Will do some investigations.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org