Posted to github@beam.apache.org by "AnandInguva (via GitHub)" <gi...@apache.org> on 2023/05/09 23:41:54 UTC

[GitHub] [beam] AnandInguva opened a new issue, #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

AnandInguva opened a new issue, #26611:
URL: https://github.com/apache/beam/issues/26611

   
     Performance change found in the
     test: `Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks` for the metric: `mean_inference_batch_latency_micro_secs`.
   
     For more information on how to triage the alerts, please look at
     `Triage performance alert issues` section of the [README](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/testing/analyzers/README.md#triage-performance-alert-issues).
   
   
   timestamp: Mon May  8 20:21:00 2023, metric_value: `351051.74592690443`
   timestamp: Sun May  7 20:09:42 2023, metric_value: `373425.0474158369`
   timestamp: Sat May  6 20:12:19 2023, metric_value: `242029.10829268294`
   timestamp: Thu May  4 18:12:51 2023, metric_value: `361223.52659574465`
   timestamp: Wed May  3 18:22:09 2023, metric_value: `328118.64653425216`
   timestamp: Tue May  2 18:14:51 2023, metric_value: `424903.4859154929`
   timestamp: Mon May  1 18:14:12 2023, metric_value: `387124.80579584773`
   timestamp: Sun Apr 30 18:09:14 2023, metric_value: `306414.3679245283`
   timestamp: Sat Apr 29 18:14:43 2023, metric_value: `380149.34958979033`
   timestamp: Fri Apr 28 18:11:05 2023, metric_value: `358877.55455659993`
   timestamp: Thu Apr 27 18:19:19 2023, metric_value: `403779.09769008664` <---- Anomaly
   timestamp: Wed Apr 26 18:34:08 2023, metric_value: `71642.43726750085`
   timestamp: Tue Apr 25 18:36:30 2023, metric_value: `78412.06520314548`
   timestamp: Mon Apr 24 18:33:33 2023, metric_value: `72853.56342609324`
   timestamp: Sun Apr 23 18:29:02 2023, metric_value: `79737.66775956284`
   timestamp: Sat Apr 22 18:32:37 2023, metric_value: `83166.35733538699`
   timestamp: Fri Apr 21 18:53:30 2023, metric_value: `72210.17876039304`
   timestamp: Wed Apr 19 18:37:05 2023, metric_value: `79859.03033666297`
   timestamp: Tue Apr 18 18:54:55 2023, metric_value: `72710.99006387508`
   timestamp: Mon Apr 17 18:41:13 2023, metric_value: `72510.55364642994`
   timestamp: Sun Apr 16 18:25:01 2023, metric_value: `66393.77158671587`
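
The size of the jump at the flagged point can be checked by hand. The sketch below is only an illustration: it is not the change-point method the Beam analyzer uses, and the names `before`, `after`, and `relative_change` are made up for this example. It compares the mean latency of the runs before the flagged run with the mean of the runs from the flagged run onward, using the values listed above.

```python
from statistics import mean

# Metric values copied from the list above, oldest first
# (mean_inference_batch_latency_micro_secs).
# Runs from Apr 16 through Apr 26, before the flagged point.
before = [
    66393.77158671587, 72510.55364642994, 72710.99006387508,
    79859.03033666297, 72210.17876039304, 83166.35733538699,
    79737.66775956284, 72853.56342609324, 78412.06520314548,
    71642.43726750085,
]
# Runs from Apr 27 (the flagged run) through May 8.
after = [
    403779.09769008664, 358877.55455659993, 380149.34958979033,
    306414.3679245283, 387124.80579584773, 424903.4859154929,
    361223.52659574465, 328118.64653425216, 242029.10829268294,
    373425.0474158369, 351051.74592690443,
]


def relative_change(old, new):
    """Return mean(new) / mean(old); values above 1 mean higher latency."""
    return mean(new) / mean(old)


ratio = relative_change(before, after)
print(f"mean before: {mean(before):.0f} us")
print(f"mean after:  {mean(after):.0f} us")
print(f"ratio:       {ratio:.2f}x")
```

Since the metric is a latency, a ratio well above 1 means the batches got slower. Even the fastest run after the flagged point is slower than the slowest run before it.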


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1558177355

   I can see a bump on the graph too: http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&from=now-60d&to=now
   
   




[GitHub] [beam] damccorm commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1559680696

   @kerrydc 




[GitHub] [beam] AnandInguva commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "AnandInguva (via GitHub)" <gi...@apache.org>.
AnandInguva commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1583074076

   I am testing if this commit https://github.com/apache/beam/commit/1a51ba5edebef9622c016331071764830adffc94 led to the regression.
   
   Test PR: https://github.com/apache/beam/pull/27071




[GitHub] [beam] tvalentyn commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1583149227

   thanks, do we want tests to use torch==2.0.0?




[GitHub] [beam] tvalentyn commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1558192436

   Noting that, according to Grafana, the bump happens on Apr 26, while the description of this bug puts it on the next day, possibly because of a timezone difference.




[GitHub] [beam] tvalentyn commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1558193755

   https://github.com/apache/beam/commit/1a51ba5edebef9622c016331071764830adffc94 might be related
   




[GitHub] [beam] AnandInguva commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "AnandInguva (via GitHub)" <gi...@apache.org>.
AnandInguva commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1583058943

   ![image](https://github.com/apache/beam/assets/34158215/919cd348-321a-4c56-ac6f-807d33c05287)




[GitHub] [beam] tvalentyn commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1558173245

   This looks like a regression, not an improvement?




[GitHub] [beam] AnandInguva commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "AnandInguva (via GitHub)" <gi...@apache.org>.
AnandInguva commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1583359472

   Closing this as `no regression in Beam`: the container used to run the experiment needed to be updated to pick up the latest torch version.
   
   I updated the container.




[GitHub] [beam] AnandInguva commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "AnandInguva (via GitHub)" <gi...@apache.org>.
AnandInguva commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1583150829

   > thanks, do we want tests to use torch==2.0.0?
   
   Yes, we need to build the Docker container: https://github.com/apache/beam/pull/27072. I am in the process of doing it. Right now it is a manual effort; we need to automate it for the tests.




[GitHub] [beam] AnandInguva closed issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "AnandInguva (via GitHub)" <gi...@apache.org>.
AnandInguva closed issue #26611: 
  Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

URL: https://github.com/apache/beam/issues/26611




[GitHub] [beam] AnandInguva commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "AnandInguva (via GitHub)" <gi...@apache.org>.
AnandInguva commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1583084964

   I looked at the worker logs of the job. At the time of the performance improvement around March 15th, 2023, the logs show that Dataflow was installing `torch==2.0.0`, but in the latest experiments it has started to use `torch==1.12.0`. Because of this, the graph went back to its earlier level.
   
   @kerrydc @tvalentyn @damccorm it is not a regression in Beam.
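
A version mismatch like this could be caught from inside the job rather than by reading worker logs afterwards. The following is a minimal sketch, not part of the Beam benchmark code: the helper names `installed_version` and `check_pin` are hypothetical, and the expected pin `2.0.0` is assumed only because it is the version discussed in this thread. It uses the standard-library `importlib.metadata` to read the version of an installed distribution.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pin(package, expected):
    """Return (matches, found) for the installed version against the expected pin."""
    found = installed_version(package)
    return found == expected, found


# Hypothetical usage at benchmark start-up; "2.0.0" is an assumed pin.
matches, found = check_pin("torch", "2.0.0")
if not matches:
    print(f"warning: expected torch==2.0.0 but found {found!r}")
```

Logging the result at worker start-up would put the torch version next to the latency metric, so a change in the container shows up alongside the change in the graph.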




[GitHub] [beam] AnandInguva commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "AnandInguva (via GitHub)" <gi...@apache.org>.
AnandInguva commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1541025069

   Seems like this is a performance improvement




[GitHub] [beam] tvalentyn commented on issue #26611: Performance Regression or Improvement: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU:mean_inference_batch_latency_micro_secs

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn commented on issue #26611:
URL: https://github.com/apache/beam/issues/26611#issuecomment-1558193854

   cc: @damccorm @AnandInguva 

