Posted to github@beam.apache.org by "Abacn (via GitHub)" <gi...@apache.org> on 2023/05/03 00:33:14 UTC

[GitHub] [beam] Abacn commented on issue #26395: [Bug]: Possibly unnecessary prefetch during GroupIntoBatches

Abacn commented on issue #26395:
URL: https://github.com/apache/beam/issues/26395#issuecomment-1532313572

   Thanks for reporting this. This `if` block has been there since `GroupIntoBatches` was first implemented (#2610), so I assume there was some reason for it. `WithShardedKey` was added some time later (still 3 years ago). I haven't looked into the details yet, but I would like to dig into it.
   
   > "Total streaming data processed" metric was so much higher (2-4× depending on the pipeline) than the actual data
   
   Have you verified that this is caused by the code path you pointed to? It would be nice if you had some benchmark data to share.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org