Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/07/31 02:00:45 UTC

[GitHub] [arrow] cyb70289 commented on a change in pull request #7863: ARROW-9344: [C++][Flight] Measure latency quantiles

cyb70289 commented on a change in pull request #7863:
URL: https://github.com/apache/arrow/pull/7863#discussion_r463366242



##########
File path: cpp/src/arrow/flight/flight_benchmark.cc
##########
@@ -62,20 +69,40 @@ struct PerformanceResult {
 };
 
 struct PerformanceStats {
-  PerformanceStats() {}
+  using accumulator_type = acc::accumulator_set<
+      double, acc::stats<acc::tag::extended_p_square_quantile(acc::quadratic),
+                         acc::tag::mean, acc::tag::max>>;
+
+  PerformanceStats() : latencies(acc::extended_p_square_probabilities = quantiles) {}
   std::mutex mutex;
   int64_t total_batches = 0;
   int64_t total_records = 0;
   int64_t total_bytes = 0;
-  uint64_t total_nanos = 0;
+  const std::array<double, 3> quantiles = {0.5, 0.95, 0.99};
+  accumulator_type latencies;
 
-  void Update(int64_t total_batches, int64_t total_records, int64_t total_bytes,
-              uint64_t total_nanos) {
+  void Update(int64_t total_batches, int64_t total_records, int64_t total_bytes) {
     std::lock_guard<std::mutex> lock(this->mutex);
     this->total_batches += total_batches;
     this->total_records += total_records;
     this->total_bytes += total_bytes;
-    this->total_nanos += total_nanos;
+  }
+
+  // Invoked per batch in the test loop. Holding a lock looks not scalable.
+  // Tested with 1 ~ 8 threads, no noticeable overhead is observed.
+  // A better approach may be leveraging a lockless queue.

Review comment:
   Hmm, merging quantiles does not look straightforward. Boost uses the [P-square](https://www.cse.wustl.edu/~jain/papers/ftp/psqr.pdf) algorithm, which *estimates* each quantile without storing the observed values, so there are no raw samples left to combine across threads.
   If there isn't much variance among threads, maybe we can simply average the per-thread quantiles or pick the max. I will investigate.
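
   A minimal sketch of the two fallback options mentioned above, assuming each thread keeps its own accumulator and reports one estimate per configured quantile (0.5, 0.95, 0.99). The names `QuantileEstimates`, `AverageQuantiles` and `MaxQuantiles` are hypothetical and not part of this PR. Neither result is a true merged quantile; both are only reasonable approximations when the per-thread latency distributions are similar.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical alias: one estimate each for the 0.5, 0.95 and 0.99 quantiles.
using QuantileEstimates = std::array<double, 3>;

// Element-wise mean of the per-thread estimates.
// Approximation only: the mean of quantiles is not the quantile of the
// combined samples unless the threads see similar distributions.
QuantileEstimates AverageQuantiles(const std::vector<QuantileEstimates>& per_thread) {
  QuantileEstimates merged{};  // zero-initialized
  if (per_thread.empty()) return merged;
  for (const auto& estimates : per_thread) {
    for (std::size_t i = 0; i < merged.size(); ++i) merged[i] += estimates[i];
  }
  for (auto& value : merged) value /= static_cast<double>(per_thread.size());
  return merged;
}

// Element-wise maximum of the per-thread estimates.
// Conservative choice: reports the slowest thread's estimate for each quantile.
QuantileEstimates MaxQuantiles(const std::vector<QuantileEstimates>& per_thread) {
  QuantileEstimates merged{};  // zero-initialized
  if (per_thread.empty()) return merged;
  merged = per_thread.front();
  for (const auto& estimates : per_thread) {
    for (std::size_t i = 0; i < merged.size(); ++i) {
      merged[i] = std::max(merged[i], estimates[i]);
    }
  }
  return merged;
}
```

   With this shape, each thread would only need to extract its three estimates once at the end of the run, so the per-batch lock on the shared accumulator could go away.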



