Posted to issues@beam.apache.org by "Anonymous (Jira)" <ji...@apache.org> on 2023/04/13 10:56:00 UTC

[jira] [Updated] (BEAM-7442) Bounded Reads for Flink Runner fails with OOM

     [ https://issues.apache.org/jira/browse/BEAM-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anonymous updated BEAM-7442:
----------------------------
    Status: Triage Needed  (was: Resolved)

> Bounded Reads for Flink Runner fails with OOM
> ---------------------------------------------
>
>                 Key: BEAM-7442
>                 URL: https://issues.apache.org/jira/browse/BEAM-7442
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Akshay Iyangar
>            Assignee: Akshay Iyangar
>            Priority: P2
>             Fix For: 2.14.0
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When the Flink runner reads from a bounded source and the total number of files is very large, FlinkRunner throws an OOM error. This happens because the current implementation reads the files simultaneously rather than sequentially, so all of them are held in memory at once, which quickly breaks the cluster.
> Solution: wrap the `UnboundedReadFromBoundedSource` class in a wrapper so that, when the stream comes from a bounded source, the files are read sequentially using a queue.
>  
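The queue-based idea in the description can be illustrated with a minimal, standalone Java sketch. It does not use Beam or Flink APIs and is not the fix that shipped for this issue. The class name `SequentialSourceReader` and its methods are hypothetical, and each bounded source is modelled, as an assumption, by a `Supplier<Iterator<T>>` that is only opened when its turn comes.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.function.Supplier;

/** Hypothetical illustration only; not a Beam class and not the actual fix. */
final class SequentialSourceReader<T> implements Iterator<T> {
  // Sources waiting to be read. Each is a lazy supplier, so an unopened
  // source holds none of its data in memory.
  private final Queue<Supplier<Iterator<T>>> pending = new ArrayDeque<>();
  // The single source that is currently open, or null if none is open.
  private Iterator<T> current;
  // Number of sources opened so far (kept only to make the behaviour observable).
  private int opened = 0;

  SequentialSourceReader(List<Supplier<Iterator<T>>> sources) {
    pending.addAll(sources);
  }

  @Override
  public boolean hasNext() {
    // Open the next queued source only after the current one is drained.
    while (current == null || !current.hasNext()) {
      Supplier<Iterator<T>> next = pending.poll();
      if (next == null) {
        current = null;
        return false; // queue is empty and the last source is exhausted
      }
      current = next.get();
      opened++;
    }
    return true;
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return current.next();
  }

  int openedCount() {
    return opened;
  }
}
```

In this sketch at most one source is open at any time, so the peak memory use depends on the largest single source rather than on the total number of files.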



--
This message was sent by Atlassian Jira
(v8.20.10#820010)