Posted to issues@beam.apache.org by "Sam Whittle (Jira)" <ji...@apache.org> on 2022/01/06 10:13:00 UTC

[jira] [Reopened] (BEAM-12776) Improve parallelism of closing files in FileIO

     [ https://issues.apache.org/jira/browse/BEAM-12776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Whittle reopened BEAM-12776:
--------------------------------

GCS writes use a large buffer by default, so closing all windows in parallel can increase memory usage and trigger OOMs. I plan to limit the close parallelism to a fixed amount and make that limit configurable through an option.
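
As a rough illustration of limiting close parallelism, here is a minimal sketch in plain Java. It is not the Beam change itself: the class BoundedCloser, the method closeAll, and the maxParallelism parameter are hypothetical names, and a fixed thread pool is just one assumed way to cap how many closes (and so how many write buffers being flushed) are in flight at once.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper for illustration only; not part of Beam. */
public class BoundedCloser {

  /** Closes every writer, running at most maxParallelism closes at a time. */
  public static void closeAll(List<? extends Closeable> writers, int maxParallelism)
      throws IOException, InterruptedException {
    if (maxParallelism < 1) {
      throw new IllegalArgumentException("maxParallelism must be >= 1");
    }
    // The pool size is the cap: extra close tasks queue until a thread frees up.
    ExecutorService pool = Executors.newFixedThreadPool(maxParallelism);
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (Closeable writer : writers) {
        // Returning null makes this a Callable, so close() may throw IOException.
        pending.add(pool.submit(() -> {
          writer.close();
          return null;
        }));
      }
      // Wait for every close; report the first failure and attach the rest.
      IOException failure = null;
      for (Future<?> f : pending) {
        try {
          f.get();
        } catch (ExecutionException e) {
          IOException wrapped = (e.getCause() instanceof IOException)
              ? (IOException) e.getCause()
              : new IOException(e.getCause());
          if (failure == null) {
            failure = wrapped;
          } else {
            failure.addSuppressed(wrapped);
          }
        }
      }
      if (failure != null) {
        throw failure;
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

With a cap of 4, a call such as BoundedCloser.closeAll(writers, 4) keeps at most four closes running at once regardless of how many windows fired, which trades some close latency for a bound on buffer memory.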

> Improve parallelism of closing files in FileIO
> ----------------------------------------------
>
>                 Key: BEAM-12776
>                 URL: https://issues.apache.org/jira/browse/BEAM-12776
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-files
>            Reporter: Sam Whittle
>            Assignee: Sam Whittle
>            Priority: P2
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, close happens in processElement, which is per-window.
> If many windows are firing, throughput can be throttled by waiting on IO for each close; the files could instead be closed in parallel in finishBundle.
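
The idea in the description above, deferring the close from the per-window call to the end of the bundle, can be sketched in plain Java as follows. This is only an illustration under assumptions: DeferredCloseSketch is a hypothetical stand-in and not Beam's DoFn API, its processElement and finishBundle methods merely mirror the names used in the issue, and CompletableFuture on the common ForkJoinPool is an assumed choice of executor.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a per-bundle writer; not Beam's DoFn API. */
public class DeferredCloseSketch {
  private final List<Closeable> openWriters = new ArrayList<>();

  /** Called once per window: remember the writer instead of closing it here. */
  public void processElement(Closeable writerForWindow) {
    openWriters.add(writerForWindow);
  }

  /** Called once per bundle: start every close, then wait for all of them. */
  public void finishBundle() {
    List<CompletableFuture<Void>> closes = new ArrayList<>();
    for (Closeable writer : openWriters) {
      // Each close is submitted as its own task on the common ForkJoinPool.
      closes.add(CompletableFuture.runAsync(() -> {
        try {
          writer.close();
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      }));
    }
    openWriters.clear();
    // join() rethrows a failed close wrapped in a CompletionException.
    CompletableFuture.allOf(closes.toArray(new CompletableFuture[0])).join();
  }
}
```

In this sketch no IO wait happens in processElement, so the per-window calls return immediately and the waiting is paid once per bundle in finishBundle.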


