Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/03/16 14:09:40 UTC

[GitHub] [beam] je-ik commented on pull request #17097: [BEAM-14064] ElasticsearchIO remove bundle based

je-ik commented on pull request #17097:
URL: https://github.com/apache/beam/pull/17097#issuecomment-1069166593


   > b) For anyone using stateful batching before, no performance change would be present. In terms of _Elasticsearch_ performance, state-based batching is highly preferred in my experience. I have been able to improve throughput 100x by using state-based over bundle-based batching. In terms of Beam performance on a runner, I don't have concrete numbers describing the performance delta. I do know that a production workload I manage processes millions of documents (GBs of data) per hour on a single vCPU using stateful batching. In my experience, the process of indexing data into ES has always been heavily IO bound.
   
   Makes sense. Thanks for the clarification.

