Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/05 00:22:53 UTC

[GitHub] [beam] damccorm opened a new issue, #21569: Beam slowness compared to flink-native

damccorm opened a new issue, #21569:
URL: https://github.com/apache/beam/issues/21569

   I tried to compare a very simple Beam pipeline with an equivalent Flink-native pipeline. Both pipelines read strings from one Kafka topic and write them to another topic. I ran each pipeline separately on a single task manager with a single slot and parallelism 1.
   
   Flink native runs 5 times faster than Beam: 150,000 strings per second in Flink compared to 30,000 in Beam.
   
   When using Avro and a schema registry, the difference is even more significant: Flink native runs 30-80 times faster than Beam.
   
   Attached is the Java code of both string-to-string pipelines.
   
   Imported from Jira [BEAM-14438](https://issues.apache.org/jira/browse/BEAM-14438). Original Jira may contain additional context.
   Reported by: iafek.
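
To make the comparison above concrete, here is a minimal sketch of how a records-per-second figure and the resulting speedup can be derived from a record count and a measurement window. The class and method names are hypothetical, and the counts and the 10-second window are assumptions picked only to reproduce the 150,000 and 30,000 strings per second quoted in the report; they are not measurements from the attached pipelines.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: turns a record count and elapsed time into a rate. */
class ThroughputComparison {

    /** Records per second for a run that processed `records` in `elapsedNanos`. */
    public static double recordsPerSecond(long records, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsedNanos must be positive");
        }
        return records * (double) TimeUnit.SECONDS.toNanos(1) / elapsedNanos;
    }

    /** How many times faster the first rate is than the second. */
    public static double speedup(double fasterRate, double slowerRate) {
        return fasterRate / slowerRate;
    }

    public static void main(String[] args) {
        // Assumed measurement window and counts (illustrative only).
        long windowNanos = TimeUnit.SECONDS.toNanos(10);
        double flinkRate = recordsPerSecond(1_500_000L, windowNanos);
        double beamRate = recordsPerSecond(300_000L, windowNanos);

        System.out.println("flink records/s: " + flinkRate);
        System.out.println("beam records/s:  " + beamRate);
        System.out.println("speedup:         " + speedup(flinkRate, beamRate));
    }
}
```

Measuring both pipelines over the same window and against the same input topic keeps the two rates comparable.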


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] ifat-afek commented on issue #21569: Beam slowness compared to flink-native

Posted by GitBox <gi...@apache.org>.
ifat-afek commented on issue #21569:
URL: https://github.com/apache/beam/issues/21569#issuecomment-1159618546

   The pipelines are attached to the jira ticket: https://issues.apache.org/jira/browse/BEAM-14438 
   Thanks :-)
   




[GitHub] [beam] kennknowles commented on issue #21569: Beam slowness compared to flink-native

Posted by GitBox <gi...@apache.org>.
kennknowles commented on issue #21569:
URL: https://github.com/apache/beam/issues/21569#issuecomment-1157964395

   It would be very cool to have these pipelines so we can improve their performance. Given those numbers, this is probably very low-hanging fruit.




[GitHub] [beam] Abacn commented on issue #21569: Beam slowness compared to flink-native

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #21569:
URL: https://github.com/apache/beam/issues/21569#issuecomment-1399657621

   For anyone who finds this issue relevant, adding the pipeline option `--fasterCopy=true` may help. Context: #13240
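
For background, my understanding is that `fasterCopy` asks the Flink runner to skip the deep copy it otherwise makes when handing elements between chained operators. The sketch below is not Beam code; it only illustrates, with hypothetical names and a UTF-8 round trip standing in for a coder, the difference between copying an element by encoding and decoding it and passing the same immutable reference along.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration; none of these names come from Beam or Flink. */
class CopyBetweenOperators {

    /** Defensive copy: encode to bytes and decode again (stand-in for a coder round trip). */
    public static String deepCopy(String element) {
        byte[] encoded = element.getBytes(StandardCharsets.UTF_8);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    /** No copy: hand the same immutable reference to the next operator. */
    public static String passThrough(String element) {
        return element;
    }

    public static void main(String[] args) {
        String record = "payload-0123456789";

        String copied = deepCopy(record);
        String shared = passThrough(record);

        // The deep copy is equal in content but is a distinct object,
        // so every element pays for an encode, a decode, and an allocation.
        System.out.println("copy equal:    " + copied.equals(record));
        System.out.println("copy distinct: " + (copied != record));

        // The pass-through reuses the original object at no extra cost.
        System.out.println("same object:   " + (shared == record));
    }
}
```

Skipping the copy is only safe when downstream steps do not mutate their inputs, which is why it is an opt-in flag rather than the default.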




[GitHub] [beam] ifat-afek commented on issue #21569: Beam slowness compared to flink-native

Posted by "ifat-afek (via GitHub)" <gi...@apache.org>.
ifat-afek commented on issue #21569:
URL: https://github.com/apache/beam/issues/21569#issuecomment-1399916871

   Thanks! I already tried fasterCopy and the performance improvement was not significant.
   




[GitHub] [beam] ifat-afek commented on issue #21569: Beam slowness compared to flink-native

Posted by "ifat-afek (via GitHub)" <gi...@apache.org>.
ifat-afek commented on issue #21569:
URL: https://github.com/apache/beam/issues/21569#issuecomment-1401771037

   We had severe performance issues in our pipelines, so we decided to test and compare very simple pipelines to find the root cause of the problems. Of course, the plan is to tune the scaling later on.
   How can I profile the pipeline and understand where the overhead is?
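
One coarse way to start answering this before attaching a real profiler is to wrap each per-record step in a timer and compare cumulative time per step. The sketch below assumes nothing about Beam or Flink: `StageTimer`, the stage names, and the encode/decode steps are all hypothetical stand-ins, and the helper is not thread-safe. A sampling profiler attached to the task manager JVM would give a far more complete picture.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: accumulates wall-clock nanoseconds per named stage. */
class StageTimer {
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    /** Wraps a function so that every call adds its duration to the stage total. */
    public <I, O> Function<I, O> timed(String stage, Function<I, O> fn) {
        return input -> {
            long start = System.nanoTime();
            try {
                return fn.apply(input);
            } finally {
                nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
            }
        };
    }

    /** Read-only view of the cumulative nanoseconds recorded per stage. */
    public Map<String, Long> totals() {
        return Collections.unmodifiableMap(nanosByStage);
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();

        // Stand-ins for per-record work such as serialization and deserialization.
        Function<String, byte[]> encode =
            timer.timed("encode", s -> s.getBytes(StandardCharsets.UTF_8));
        Function<byte[], String> decode =
            timer.timed("decode", b -> new String(b, StandardCharsets.UTF_8));

        for (int i = 0; i < 100_000; i++) {
            decode.apply(encode.apply("record-" + i));
        }

        timer.totals().forEach((stage, nanos) ->
            System.out.printf("%s: %.2f ms%n", stage, nanos / 1_000_000.0));
    }
}
```

The timer itself adds overhead to every call, so the totals are best read as relative shares between stages rather than as absolute costs.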




[GitHub] [beam] kennknowles commented on issue #21569: Beam slowness compared to flink-native

Posted by "kennknowles (via GitHub)" <gi...@apache.org>.
kennknowles commented on issue #21569:
URL: https://github.com/apache/beam/issues/21569#issuecomment-1400744712

   I know that Flink has a highly tuned Kafka connector. And we do expect _some_ overhead since Beam will add a couple of layers of virtual method calls. If there are inefficiencies in serialization, that will be a serious overhead.
   
   The other thing I should say is that in streaming we of course care primarily about cost and maximum scale, so limiting everything to 1 task manager etc. is not necessarily measuring what you care about. It is still useful for finding overhead, for sure.




[GitHub] [beam] kennknowles commented on issue #21569: Beam slowness compared to flink-native

Posted by "kennknowles (via GitHub)" <gi...@apache.org>.
kennknowles commented on issue #21569:
URL: https://github.com/apache/beam/issues/21569#issuecomment-1400742801

   Have you looked at a profile to see where the overhead is?

