Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/11/16 22:00:29 UTC

[GitHub] [beam] damccorm commented on a diff in pull request #24216: Document our benchmarks

damccorm commented on code in PR #24216:
URL: https://github.com/apache/beam/pull/24216#discussion_r1024548315


##########
sdks/python/apache_beam/testing/benchmarks/inference/README.md:
##########
@@ -0,0 +1,34 @@
+# RunInference Benchmarks
+
+This module contains benchmarks that measure the performance of the RunInference transform
+when running inference with common models and frameworks. Each benchmark is explained in
+detail below. Beam's performance over time can be viewed on the
+[Python ML RunInference dashboard](http://s.apache.org/beam-community-metrics/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1).
+
+## PyTorch RunInference Image Classification 50K
+
+The PyTorch RunInference Image Classification 50K benchmark runs an
+[example image classification pipeline](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/pytorch_image_classification.py)
+using several ResNet image classification models (the benchmarks on
+[Beam's dashboard](http://s.apache.org/beam-community-metrics/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1)
+display [resnet101](https://huggingface.co/microsoft/resnet-101) and [resnet152](https://huggingface.co/microsoft/resnet-152))
+against 50,000 example images from the Open Images dataset. A sketch of the pipeline
+shape follows the metrics list below. The benchmarks produce the following metrics:
+
+- Mean Inference Requested Batch Size - the average number of images per batch that RunInference groups together for batch prediction.
+- Mean Inference Batch Latency - the average amount of time it takes to perform inference on a given batch of images.
+- Mean Load Model Latency - the average amount of time it takes to load a model. This is done once per DoFn instance at worker
+startup, so the cost is amortized across the pipeline.
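+
+For orientation, here is a rough, untested sketch of the pipeline shape, loosely
+modeled on the linked example. The bucket paths, preprocessing parameters, and the
+metric name queried at the end are illustrative assumptions, not the benchmark's
+actual configuration:
+
+```python
+# A minimal keyed RunInference image classification sketch (paths are placeholders).
+import apache_beam as beam
+import torch
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.metrics.metric import MetricsFilter
+from apache_beam.ml.inference.base import KeyedModelHandler, RunInference
+from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor
+from PIL import Image
+from torchvision import models, transforms
+
+
+def read_and_preprocess(path):
+  # Decode an image and resize/crop it to the 224x224 tensor ResNet expects.
+  with FileSystems.open(path) as f:
+    image = Image.open(f).convert('RGB')
+  preprocess = transforms.Compose([
+      transforms.Resize(256),
+      transforms.CenterCrop(224),
+      transforms.ToTensor(),
+  ])
+  # Key each example by its path so predictions stay joinable to their inputs.
+  return path, preprocess(image)
+
+
+model_handler = KeyedModelHandler(
+    PytorchModelHandlerTensor(
+        state_dict_path='gs://my-bucket/resnet101.pth',  # placeholder path
+        model_class=models.resnet101,
+        model_params={'num_classes': 1000}))
+
+pipeline = beam.Pipeline()
+_ = (
+    pipeline
+    | 'ReadImagePaths' >> beam.io.ReadFromText('gs://my-bucket/images.txt')
+    | 'Preprocess' >> beam.Map(read_and_preprocess)
+    # RunInference groups the keyed tensors into batches before prediction,
+    # which is where the batch size and batch latency metrics come from.
+    | 'RunInference' >> RunInference(model_handler)
+    | 'TopPrediction' >> beam.Map(
+        lambda kv: (kv[0], torch.argmax(kv[1].inference, dim=0).item()))
+    | 'Print' >> beam.Map(print))
+
+result = pipeline.run()
+result.wait_until_finish()
+
+# The metrics above are reported through Beam's metrics API; the distribution
+# name below is an assumption about what RunInference records.
+latency = result.metrics().query(
+    MetricsFilter().with_name('inference_batch_latency_micro_secs'))
+print(latency['distributions'])
+```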
+
+## PyTorch RunInference Language Modeling
+
+The PyTorch RunInference Language Modeling benchmark runs an
+[example language modeling pipeline](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/pytorch_language_modeling.py)
+using the [BERT large uncased](https://huggingface.co/bert-large-uncased)
+and [BERT base uncased](https://huggingface.co/bert-base-uncased) models
+and a dataset of 50,000 sentences from the TODO dataset. The benchmarks produce

Review Comment:
   TODO must be resolved before merging


