Posted to issues@flink.apache.org by "Xintong Song (Jira)" <ji...@apache.org> on 2021/01/08 10:38:00 UTC

[jira] [Closed] (FLINK-20860) Allow streaming operators to use managed memory

     [ https://issues.apache.org/jira/browse/FLINK-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xintong Song closed FLINK-20860.
--------------------------------
    Resolution: Done

Merged via
* master (1.13): ed354d9c93b38d843269c29b291271ba8400c7d9

> Allow streaming operators to use managed memory
> -----------------------------------------------
>
>                 Key: FLINK-20860
>                 URL: https://issues.apache.org/jira/browse/FLINK-20860
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Configuration, Runtime / Task
>            Reporter: Jark Wu
>            Assignee: Xintong Song
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.0
>
>
> We are planning to use some batch algorithms (sorting & bytes hash table) to improve the performance of streaming SQL operators, especially the mini-batch operators introduced by FLIP-145.
> Currently, we have to buffer input records and accumulators on the heap (i.e. in a Java HashMap), which is inefficient and carries a risk of full GC and OOM. With managed memory, we can buffer more data without worrying about OOM and improve performance considerably. However, streaming operators are currently not allowed to use managed memory.
> As discussed on the mailing list [1], we have reached a consensus to extend the configuration {{taskmanager.memory.managed.consumer-weights}} with 2 more options, {{OPERATOR}} and {{STATE_BACKEND}}. The available consumer options will be:
> * `OPERATOR` for both streaming and batch operators
> * `STATE_BACKEND` for state backends
> * `PYTHON` for python processes
> * `DATAPROC` as a legacy key for state backend or batch operators if `STATE_BACKEND` or `OPERATOR` are not specified.
> The previous default value is {{DATAPROC:70,PYTHON:30}}; the new default value will be {{OPERATOR:70,STATE_BACKEND:70,PYTHON:30}}.
> The weights for OPERATOR and STATE_BACKEND will have the same value, to align with the previous behavior.
> [1]: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Allow-streaming-operators-to-use-managed-memory-td47327.html
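
The weight strings quoted above can be illustrated with a small sketch. It is not Flink code: the helper names `parse_consumer_weights` and `managed_memory_shares` are hypothetical, and the sketch assumes (the issue text does not state this) that each consumer's share is its weight divided by the sum of the weights of the consumer types a job actually uses. Under that assumption, a job with a state backend and a Python process gets the same split under the new default as under the old one.

```python
def parse_consumer_weights(spec):
    """Parse a 'NAME:WEIGHT,NAME:WEIGHT' string into a dict of int weights."""
    weights = {}
    for entry in spec.split(","):
        name, _, value = entry.strip().partition(":")
        if not name or not value:
            raise ValueError(f"malformed entry: {entry!r}")
        weights[name] = int(value)
    return weights


def managed_memory_shares(weights, active_consumers):
    """Split managed memory in proportion to the weights of the active consumers.

    Assumption for illustration: only consumer types present in the job
    take part in the normalization.
    """
    total = sum(weights[c] for c in active_consumers)
    return {c: weights[c] / total for c in active_consumers}


OLD_DEFAULT = "DATAPROC:70,PYTHON:30"
NEW_DEFAULT = "OPERATOR:70,STATE_BACKEND:70,PYTHON:30"

old_shares = managed_memory_shares(
    parse_consumer_weights(OLD_DEFAULT), ["DATAPROC", "PYTHON"]
)
new_shares = managed_memory_shares(
    parse_consumer_weights(NEW_DEFAULT), ["STATE_BACKEND", "PYTHON"]
)
```

Here `old_shares["DATAPROC"]` and `new_shares["STATE_BACKEND"]` come out equal, which is one way to read the statement that giving OPERATOR and STATE_BACKEND the same weight aligns with the previous behavior.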



--
This message was sent by Atlassian Jira
(v8.3.4#803005)