Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/05 13:18:29 UTC

[GitHub] [beam] ryantam626 opened a new issue, #22159: [Bug]: Performance degradation in Dataflow job when using `grpcio==1.45.0`

ryantam626 opened a new issue, #22159:
URL: https://github.com/apache/beam/issues/22159

   ### What happened?
   
   Runner: Dataflow runner
   SDK: Python
   Version: 2.38.0
   
   I recently switched to Poetry for Python dependency management, which inadvertently upgraded some implicit dependencies, and noticed a significant performance degradation with the new setup. After bisecting the dependency versions, I concluded that upgrading from `grpcio==1.44.0` to `grpcio==1.45.0` is the probable cause of the degradation.
   
   I don't have the capacity to provide a reproducible example or to debug further, apologies. Hopefully this is enough.
   
   Here are some interesting screenshots:
   
   Dataflow job CPU util pattern with `grpcio==1.44.0`
   ![Selection_502](https://user-images.githubusercontent.com/8895126/177336591-1aa66411-97e7-4511-9a96-9db43de89efb.png)
   
   Dataflow job CPU util pattern with `grpcio==1.45.0`
   ![Selection_500](https://user-images.githubusercontent.com/8895126/177336610-698b866a-a14b-4300-951e-b6c21f98194e.png)
   
   Notice how CPU utilisation never plateaus at 100% in the second screenshot. Both jobs processed exactly the same input data with exactly the same code; the only difference is the upgraded `grpcio` and `grpcio-status` versions.
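
When comparing two runs like the ones above, it can help to confirm which package versions are actually installed in each environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` is hypothetical and is not part of Beam or gRPC.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None. The helper name is
    hypothetical; it only wraps the standard-library importlib.metadata API.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print the versions relevant to this report in pip-style notation.
    for name, version in installed_versions(
        ["grpcio", "grpcio-status", "apache-beam"]
    ).items():
        print(f"{name}=={version}")
```

Running this in both environments shows whether `grpcio` and `grpcio-status` are the only packages that differ from that short list; extend the list as needed.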
   
   
   ### Issue Priority
   
   Priority: 2
   
   ### Issue Component
   
   Component: runner-dataflow


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] ryantam626 commented on issue #22159: [Bug]: Performance degradation in Dataflow job when using `grpcio==1.45.0`

Posted by GitBox <gi...@apache.org>.
ryantam626 commented on issue #22159:
URL: https://github.com/apache/beam/issues/22159#issuecomment-1337722882

   I am still using `grpcio==1.44.0`, without any performance degradation.
   I still don't have the bandwidth to experiment with a newer `grpcio` package.
   I will report back once I have time to experiment with this.




[GitHub] [beam] kennknowles commented on issue #22159: [Bug]: Performance degradation in Dataflow job when using `grpcio==1.45.0`

Posted by GitBox <gi...@apache.org>.
kennknowles commented on issue #22159:
URL: https://github.com/apache/beam/issues/22159#issuecomment-1335956525

   Is this resolved now or still an issue? Have we determined this is gRPC version and discussed with upstream?




[GitHub] [beam] Abacn commented on issue #22159: [Bug]: Performance degradation in Dataflow job when using `grpcio==1.45.0`

Posted by GitBox <gi...@apache.org>.
Abacn commented on issue #22159:
URL: https://github.com/apache/beam/issues/22159#issuecomment-1198560771

   grpcio 1.45 is marked as ["yanked"](https://pypi.org/project/grpcio/1.45.0/) and should not be used. Could you please check whether the issue persists with grpcio 1.46.0? If so, it might be related to #22283.




[GitHub] [beam] ryantam626 commented on issue #22159: [Bug]: Performance degradation in Dataflow job when using `grpcio==1.45.0`

Posted by GitBox <gi...@apache.org>.
ryantam626 commented on issue #22159:
URL: https://github.com/apache/beam/issues/22159#issuecomment-1199125464

   I had also tried 1.46.3 earlier (the version Poetry pulled before I pinned `grpcio`) and observed the same degraded performance.
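
As an illustration of guarding against an accidental upgrade, a start-up check could compare the installed `grpcio` version against the last version the reporter observed to perform well. This is only a sketch: the function name `newer_than_last_good` is hypothetical, the default of `1.44.0` reflects the observations in this thread rather than a confirmed affected range, and the parser handles plain dotted numeric versions only.

```python
def parse_version(version):
    """Parse a plain dotted numeric version such as '1.46.3' into a tuple.

    Pre-release or local suffixes (for example '1.46.0rc1') are not
    supported and raise ValueError.
    """
    return tuple(int(part) for part in version.split("."))


def newer_than_last_good(version, last_good="1.44.0"):
    """Return True if `version` is strictly newer than `last_good`.

    The default `last_good` is an assumption taken from the reporter's
    observations in this thread, not a confirmed boundary.
    """
    return parse_version(version) > parse_version(last_good)


if __name__ == "__main__":
    for candidate in ["1.44.0", "1.45.0", "1.46.3"]:
        print(candidate, newer_than_last_good(candidate))
```

Such a check only flags versions for closer inspection; it says nothing about whether a given newer release actually shows the degradation.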




[GitHub] [beam] Abacn commented on issue #22159: [Bug]: Performance degradation in Dataflow job when using `grpcio==1.45.0`

Posted by GitBox <gi...@apache.org>.
Abacn commented on issue #22159:
URL: https://github.com/apache/beam/issues/22159#issuecomment-1211031330

   @aaltay Regarding the question raised in #22283: the performance regression is tracked here.

