Posted to issues@arrow.apache.org by GitBox <gi...@apache.org> on 2023/01/06 20:54:59 UTC

[GitHub] [arrow] wjones127 opened a new issue, #15231: [Benchmarking][C++] Track memory usage in C++ microbenchmarks

wjones127 opened a new issue, #15231:
URL: https://github.com/apache/arrow/issues/15231

   ### Describe the enhancement requested
   
   Google Benchmark supports registering a `MemoryManager` instance via `benchmark::RegisterMemoryManager()`. Each benchmark will then report the following in the JSON output:
   
    * `allocs_per_iter`
    * `max_bytes_used`
    * `total_allocated_bytes` (if available)
    * `net_heap_growth` (if available)
   
   Perhaps we can set up an `arrow::MemoryPool` that is also a `MemoryManager`?
   
   See:
   https://github.com/google/benchmark/blob/main/docs/user_guide.md#memory-usage
   https://github.com/google/benchmark/blob/62edc4fb00e1aeab86cc69c70eafffb17219d047/src/json_reporter.cc#L291-L305
   
   Not sure, but I think once it's in the output in the JSON then Conbench will have it saved. ([It seems to save the whole JSON output](https://github.com/voltrondata-labs/benchmarks/blob/e8a8d415af589744a5a455a16701f244a8a7cd63/benchmarks/cpp_micro_benchmarks.py#L134).) And then it's a matter of processing and rendering it in Conbench. @jonkeane do I understand correctly?
   
   ### Component(s)
   
   Benchmarking, C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] jonkeane commented on issue #15231: [Benchmarking][C++] Track memory usage in C++ microbenchmarks

Posted by GitBox <gi...@apache.org>.
jonkeane commented on issue #15231:
URL: https://github.com/apache/arrow/issues/15231#issuecomment-1379392312

   Somewhere around https://github.com/conbench/conbench/blob/fed3d11910280e4bda80d2e0274cdf239f0d0f09/benchadapt/python/benchadapt/adapters/gbench.py#L203-L228 is where we would add this to the adapter (though we'll need to make the changes to the Conbench backend before we do that). Would you mind opening an issue at https://github.com/conbench/conbench/issues to support storing these memory metrics? We can hash out there what that will look like and how to get there.




[GitHub] [arrow] wjones127 commented on issue #15231: [Benchmarking][C++] Track memory usage in C++ microbenchmarks

Posted by GitBox <gi...@apache.org>.
wjones127 commented on issue #15231:
URL: https://github.com/apache/arrow/issues/15231#issuecomment-1376592183

   > Though I do wonder how many memory issues would be revealed at the micro-benchmark level
   
   It's a good question. One thing I have noticed when I do measure memory usage is it tends to be very deterministic. So it's a lot easier to detect regressions and improvements for memory usage than it is for time-based metrics (which can be very noisy).
   
   But it's also the case that memory issues can arise from the interaction between systems, rather than just within isolated components. If we are using Google Benchmark for any macro-benchmarks, I'm happy to apply it there as well.
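
To illustrate why a deterministic metric is easier to check than a noisy one, here is a minimal Python sketch. It assumes, for illustration only, that memory numbers are fully reproducible between runs while timings need some slack. The function names and the 5% tolerance are hypothetical and do not come from Conbench or Arrow.

```python
def memory_regressed(baseline_bytes: int, contender_bytes: int) -> bool:
    """Deterministic metric: treat any increase as a real change."""
    return contender_bytes > baseline_bytes


def time_regressed(baseline_ns: float, contender_ns: float,
                   tolerance: float = 0.05) -> bool:
    """Noisy metric: only flag slowdowns beyond a relative tolerance.

    The 5% default is an arbitrary illustrative value.
    """
    return contender_ns > baseline_ns * (1.0 + tolerance)
```

With these assumptions a one-byte increase in `max_bytes_used` is flagged, while a small timing shift is absorbed by the tolerance.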
   




[GitHub] [arrow] westonpace commented on issue #15231: [Benchmarking][C++] Track memory usage in C++ microbenchmarks

Posted by GitBox <gi...@apache.org>.
westonpace commented on issue #15231:
URL: https://github.com/apache/arrow/issues/15231#issuecomment-1376572429

   This would be great. Though I do wonder how many memory issues would be revealed at the micro-benchmark level vs. at the macro-benchmark level. (That said, features like this make Google Benchmark convenient for macro-benchmarks as well.)




[GitHub] [arrow] wjones127 commented on issue #15231: [Benchmarking][C++] Track memory usage in C++ microbenchmarks

Posted by GitBox <gi...@apache.org>.
wjones127 commented on issue #15231:
URL: https://github.com/apache/arrow/issues/15231#issuecomment-1377899563

   Yeah the schema is basically:
   
   ```
   allocs_per_iter: double
   max_bytes_used: int64_t
   total_allocated_bytes: int64_t
   net_heap_growth: int64_t
   ```
   
   The latter two aren't always output, but you'll probably just treat them all as optional.
   
   Here's an example from their unit tests:
   
   https://github.com/google/benchmark/blob/62edc4fb00e1aeab86cc69c70eafffb17219d047/test/memory_manager_test.cc#L23-L37
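
A minimal Python sketch of how a consumer might read this schema while treating every field as optional. It assumes the memory keys appear directly on each entry of the report's `benchmarks` array, next to `real_time` and `cpu_time`. `MemoryMetrics`, `extract_memory_metrics`, and the sample report (name and numbers) are hypothetical and are not part of Google Benchmark or Conbench.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Keys named in this thread; a given report may omit any of them.
MEMORY_KEYS = (
    "allocs_per_iter",
    "max_bytes_used",
    "total_allocated_bytes",
    "net_heap_growth",
)


@dataclass
class MemoryMetrics:
    allocs_per_iter: Optional[float] = None
    max_bytes_used: Optional[int] = None
    total_allocated_bytes: Optional[int] = None
    net_heap_growth: Optional[int] = None


def extract_memory_metrics(entry: dict) -> MemoryMetrics:
    """Copy whichever memory keys are present; absent ones stay None."""
    return MemoryMetrics(**{k: entry[k] for k in MEMORY_KEYS if k in entry})


# Hypothetical report fragment: the benchmark name and values are made up.
SAMPLE_REPORT = json.loads("""
{
  "benchmarks": [
    {"name": "BM_Example", "real_time": 12.5, "cpu_time": 12.4,
     "time_unit": "ns", "allocs_per_iter": 2.0, "max_bytes_used": 4096}
  ]
}
""")

metrics = [extract_memory_metrics(b) for b in SAMPLE_REPORT["benchmarks"]]
```

Entries produced without a registered memory manager would simply yield a `MemoryMetrics` with every field left as `None`.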
   




[GitHub] [arrow] jonkeane commented on issue #15231: [Benchmarking][C++] Track memory usage in C++ microbenchmarks

Posted by GitBox <gi...@apache.org>.
jonkeane commented on issue #15231:
URL: https://github.com/apache/arrow/issues/15231#issuecomment-1377846463

   > Not sure, but I think once it's in the output in the JSON then Conbench will have it saved. ([It seems to save the whole JSON output](https://github.com/voltrondata-labs/benchmarks/blob/e8a8d415af589744a5a455a16701f244a8a7cd63/benchmarks/cpp_micro_benchmarks.py#L134).) And then it's a matter of processing and rendering it in Conbench. @jonkeane do I understand correctly?
   
   Yup, that's basically it. We'll also need to add a place to put it in the Conbench backend, but I (or someone on my team) would be very happy to help get that set up so that we can pass things like this on and persist them in Conbench itself. Do you happen to have any sample output showing what this would look like? If so, we could look at what would need to be added to the archery <-> Conbench adapter so that it can be passed on as soon as we have a place to store it.




[GitHub] [arrow] jonkeane commented on issue #15231: [Benchmarking][C++] Track memory usage in C++ microbenchmarks

Posted by GitBox <gi...@apache.org>.
jonkeane commented on issue #15231:
URL: https://github.com/apache/arrow/issues/15231#issuecomment-1379388634

   Ah great, that link was super helpful: it looks like they're all returned at the same level as `real_time`, `cpu_time`, etc.




[GitHub] [arrow] wjones127 closed issue #15231: [Benchmarking][C++] Track memory usage in C++ microbenchmarks

Posted by "wjones127 (via GitHub)" <gi...@apache.org>.
wjones127 closed issue #15231: [Benchmarking][C++] Track memory usage in C++ microbenchmarks
URL: https://github.com/apache/arrow/issues/15231

