Posted to issues@flink.apache.org by "Weijie Guo (Jira)" <ji...@apache.org> on 2023/03/21 07:57:00 UTC

[jira] [Assigned] (FLINK-30556) Improve the logic for enumerating splits for Hive source to avoid potential OOM

     [ https://issues.apache.org/jira/browse/FLINK-30556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weijie Guo reassigned FLINK-30556:
----------------------------------

    Assignee: Wencong Liu

> Improve the logic for enumerating splits for Hive source to avoid potential OOM
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-30556
>                 URL: https://issues.apache.org/jira/browse/FLINK-30556
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / Hive
>    Affects Versions: 1.16.0
>            Reporter: luoyuxia
>            Assignee: Wencong Liu
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, when reading a Hive source in batch mode, the source first enumerates all splits for the Hive table. When the table is large, the number of splits can be high enough to cause an OOM. Some community users have also reported this problem.
> We need to optimize the logic for enumerating splits for the Hive table source to avoid a potential OOM.
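
To illustrate the difference between eager and on-demand enumeration described above, here is a minimal, self-contained Java sketch. It is not the Hive connector's code and not the change proposed for this issue; the class name `LazySplitEnumeration`, both method names, and the `partition#index` split naming are hypothetical and exist only for illustration. It assumes, for simplicity, that every partition yields the same number of splits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; none of these names come from Flink.
public class LazySplitEnumeration {

    // Eager: materializes every split up front, so memory grows with table size.
    public static List<String> enumerateAll(List<String> partitions, int splitsPerPartition) {
        List<String> splits = new ArrayList<>();
        for (String partition : partitions) {
            for (int i = 0; i < splitsPerPartition; i++) {
                splits.add(partition + "#" + i);
            }
        }
        return splits;
    }

    // Lazy: keeps only a cursor and builds each split when it is requested.
    public static Iterator<String> enumerateLazily(List<String> partitions, int splitsPerPartition) {
        return new Iterator<String>() {
            private int partitionIndex = 0;
            private int splitIndex = 0;

            @Override
            public boolean hasNext() {
                return splitsPerPartition > 0 && partitionIndex < partitions.size();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String split = partitions.get(partitionIndex) + "#" + splitIndex;
                // Advance the cursor; move to the next partition when this one is exhausted.
                if (++splitIndex == splitsPerPartition) {
                    splitIndex = 0;
                    partitionIndex++;
                }
                return split;
            }
        };
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("dt=2023-01-01", "dt=2023-01-02");
        Iterator<String> splits = enumerateLazily(partitions, 2);
        while (splits.hasNext()) {
            System.out.println(splits.next());
        }
    }
}
```

Both methods produce the same splits in the same order. The lazy variant only ever holds one split plus two counters, which is the property that would matter for a very large table.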



--
This message was sent by Atlassian Jira
(v8.20.10#820010)