You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "wjones127 (via GitHub)" <gi...@apache.org> on 2023/02/06 16:32:46 UTC

[GitHub] [arrow] wjones127 commented on a diff in pull request #33731: GH-15231: [C++][Benchmarking] Add new memory pool metrics and track in benchmarks

wjones127 commented on code in PR #33731:
URL: https://github.com/apache/arrow/pull/33731#discussion_r1097626148


##########
cpp/src/arrow/util/benchmark_util.h:
##########
@@ -136,4 +139,62 @@ struct RegressionArgs {
   bool size_is_bytes_;
 };
 
+class MemoryPoolMemoryManager : public benchmark::MemoryManager {
+  void Start() override {
+    memory_pool = std::make_shared<ProxyMemoryPool>(default_memory_pool());
+
+    MemoryPool* default_pool = default_memory_pool();
+    global_allocations_start = default_pool->num_allocations();
+  }
+
+  void Stop(benchmark::MemoryManager::Result* result) override {
+    // If num_allocations is still zero, we assume that the memory pool wasn't passed down
+    // so we should record them.
+    MemoryPool* default_pool = default_memory_pool();
+    int64_t new_default_allocations =
+        default_pool->num_allocations() - global_allocations_start;
+
+    // Only record metrics metrics if (1) there were allocations and (2) we
+    // recorded at least one.
+    if (new_default_allocations > 0 && memory_pool->num_allocations() > 0) {
+      if (new_default_allocations > memory_pool->num_allocations()) {
+        // If we missed some, let's report that.
+        int64_t missed_allocations =
+            new_default_allocations - memory_pool->num_allocations();
+        ARROW_LOG(WARNING) << "BenchmarkMemoryTracker recorded some allocations "
+                           << "for a benchmark, but missed " << missed_allocations
+                           << " allocations.\n";
+      }
+
+      result->max_bytes_used = memory_pool->max_memory();
+      result->total_allocated_bytes = memory_pool->total_bytes_allocated();
+      result->num_allocs = memory_pool->num_allocations();
+    }
+  }
+
+ public:
+  std::shared_ptr<::arrow::ProxyMemoryPool> memory_pool;
+
+ protected:
+  int64_t global_allocations_start;
+};
+
+/// \brief Track memory pool allocations in benchmarks.
+///
+/// Instantiate as a global variable to register the hooks into Google Benchmark
+/// to collect memory metrics. Before each benchmark, a new ProxyMemoryPool is
+/// created. It can then be accessed with memory_pool(). Once the benchmark is
+/// complete, the hook will record the maximum memory used, the total bytes
+/// allocated, and the total number of allocations. If no allocations were seen,
+/// (for example, if you forgot to pass down the memory pool), then these metrics
+/// will not be saved.

Review Comment:
   Yeah that's a good point. These measurements won't work if multiple benchmarks are run concurrently. By default they aren't but I think you could turn it on from the CLI. It would be nice if I could detect that and emit a warning / turn off memory tracking in that case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org